Previous studies have identified a negative potential in the event-related potential (ERP), the error-related negativity (ERN), which is claimed to be triggered by a deviation from a reward expectation. Furthermore, this negativity is related to shifts in risk taking, strategic behavioral adjustments, and inhibition. We used a computer Blackjack gambling task to further examine the process associated with the ERN. Our findings are in line with the view that the ERN process is related to the degree of reward expectation. Furthermore, increased ERN amplitude is associated with the negative evaluation of ongoing decisions, and the amplitude of the ERN is directly related to risk-taking and decision-making behavior. However, the findings suggest that an explanation exclusively based on the deviation from a reward expectation may be insufficient and that the intention of the participants and the importance of a negative event for learning and behavioral change are crucial to the understanding of ERN phenomena.
In the early 1990s, Falkenstein and others (1991) and Gehring and others (1993) independently described a negative deflection in the event-related potential in response to errors in reaction time tasks. This so-called “error-related negativity” (ERN; Gehring and others 1993) or “error negativity” (Falkenstein and others 1991) is characterized by a negative deflection that peaks around 80 ms after the response. According to electromyography (EMG) recordings, it starts at the time of response initiation. For this reason, this ERN is referred to as the “response-locked ERN.”
The association of the response-locked ERN with remedial actions was described by Gehring and others (1993). In that study, larger ERNs were related to a decrease in error force (possibly reflecting an attempt to inhibit the erroneous response), an increased likelihood of error correction, and slower responses on the following trial. The slowing on subsequent trials suggests some kind of adaptation for future events, which reflects general inhibition or a strategic change in behavior. In another study, by Scheffers and others (1996), the ERN was also present after errors in a NoGo task (commission errors, i.e., unsuccessful inhibition of a response), in which no active error correction was possible. This may suggest that remedial action is not the only function of the ERN process and that the remedial effects of the ERN process may be inhibitory.
Further studies by Miltner and others (1997) and by other groups (Mars and others 2004; Nieuwenhuis and others 2005) suggested that an ERN, the “feedback ERN,” might also be elicited by negative performance feedback (peaking at about 250 ms; Holroyd and Coles 2002). Recently, Gehring and Willoughby (2002) used a gambling paradigm and described a negative deflection in the ERP with a peak at 265 ms, which they termed “medial frontal negativity” (MFN). This negative deflection was elicited by negative feedback (the loss of money) as compared with positive feedback (the gain of money) when the choice of the participant did not predict the outcome. According to Gehring and Willoughby (2002), the MFN may represent a different component than the ERN. However, in a subsequent replication of the Gehring and Willoughby (2002) study, Nieuwenhuis, Yeung, and others (2004) found that the MFN following utilitarian feedback about losses is similar to the ERN following performance feedback. They further showed that the salience of the feedback determines which factor (utilitarian or performance information) primarily elicits the ERN. Thus, the subjective importance of each kind of feedback seems to be relevant for ERN generation.
In another recent study, Holroyd, Larsen, and others (2004) described a pseudo trial-and-error learning task with monetary incentives. The relative outcome of the trial in relation to different possible outcomes (e.g., 0 vs. 2.5 or 5 cent gain) mainly determined the amplitude of the ERN rather than the absolute outcome (gain or loss of money). For example, a win of 2.5 cents may be considered as negative relative to a win of 5 cents but positive relative to no win at all. The findings of Holroyd, Larsen, and others (2004) indicate that the ERN is not sensitive to the absolute utilitarian value of the feedback. Thus, these results corroborate the view that the relative or subjective relevance of the feedback moderates the amplitude of the ERN (for a similar finding on subjective importance, see Yeung and others 2005). In another study, Holroyd and others (2003) showed that the same omission of a reward of 5 cents elicited a larger ERN in an experimental condition where losses were infrequent rather than frequent. They argued that a stronger deviation from the expectation of the mean reward in the infrequent condition might have been responsible for the larger ERN. In summary, the above findings are compatible with the idea that a negative deviation from a reward expectation results in an ERN (for a more detailed discussion, see Holroyd and Coles 2002).
In several recent publications, Holroyd and others (Holroyd and Coles 2002; Holroyd, Coles, and others 2002; Holroyd and others 2003; Nieuwenhuis, Holroyd, and others 2004; Holroyd and others 2005) have argued that the different negativities in response to errors and negative feedback are related to a neural mechanism of reinforcement learning (RL). According to RL theory, the basal ganglia continuously evaluate the outcome of ongoing behaviors or internal and external events against participants' expectations. If the outcome of an event is better than expected, there is an increase in phasic activity of midbrain dopaminergic neurons, whereas a decrease in activity of these neurons is induced if the event turns out to be worse than expected. These processes are closely linked to the detection of negative temporal difference (TD) errors according to mathematical formulations of learning theory (e.g., Sutton and Barto 1981). Such negative TD errors indicate a decline in goal or reward expectancy. The decrease of dopaminergic activity as a function of violated expectancy is subsequently conveyed to the anterior cingulate cortex (ACC) and the frontal cortex where apical dendrites of motor neurons become disinhibited and the ERN is generated. Several studies support the idea that the ERN is generated in the ACC and in nearby medial frontal cortical areas (Dehaene and others 1994; Miltner and others 1997; Carter and others 1998; Holroyd and others 1998; Gehring and others 2000; Kiehl and others 2000; Miltner and others 2003; Holroyd and others 2004; Ullsperger and von Cramon 2001; but see Nieuwenhuis and others 2005).
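For orientation, the TD error mentioned above is commonly written in the RL literature in the following textbook form. This is not a formula given by the studies cited here; the symbols $r_t$ (reward at time $t$), $\gamma$ (discount factor), and $V$ (expected future reward of a state $s$) are notation introduced for illustration only.

```latex
\delta_t = r_t + \gamma \, V(s_{t+1}) - V(s_t)
```

Under this notation, $\delta_t < 0$ corresponds to an event that turns out worse than expected, which is the case the RL account links to the ERN.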
In a recent study, Yeung and Sanfey (2004) reported on a gambling task where participants had to decide between large and small amounts of money that turned out to be either wins or losses. They reported on differences in risk-taking behavior after monetary losses, which correlated with feedback ERN amplitude. In particular, participants who showed greater ERNs to losses (small and large) yielded stronger shifts toward risk-taking behavior than participants with smaller ERN amplitudes. The latter finding might suggest a role for the ERN in strategic adjustments of behavior and decision making. In another study, Yasuda and others (2004) also described associations between ERN amplitude and shifts in risk-taking behavior in a card-guessing paradigm. It is important to note that feedback in the latter studies was given pseudorandomly. Thus, shifts in risk-taking behavior did not have any consequences for the outcome. However, a direct RL theory prediction of behavioral changes is based on the idea that feedback is contingent upon behavior. In such a context, increased ERN amplitudes associated with one specific decision or behavior should lead to avoidance of this decision or behavior in the future.
The present study was designed to explore RL theory in the context of a realistic gambling paradigm, in which measures of the ERN were derived for various task events. It was hypothesized that a decline in reward or goal expectancy would be positively related to the ERN amplitude. In particular, the ERN amplitude to feedback indicating a loss should be larger in situations with stronger rather than weaker reward expectancy. In other words, more unexpected negative outcomes should result in greater ERN activity. Furthermore, we hypothesized that interindividual differences in the ERN amplitude would be related to interindividual differences in risk-taking or decision-making behavior during the gambling task. In particular, we expected that participants who showed greater feedback or response-locked ERN amplitudes to a negative outcome resulting from a decision would avoid repeating similar decisions across the experiment.
Participants were recruited from the student population of the Friedrich Schiller Universität. A total of 18 right-handed participants took part in the experiment (16 females; mean age: 20.89 years, standard deviation [SD] = ±2.4 years, range 18–26 years). All were paid 6 Euro per hour for participation plus an extra bonus that varied between 0 and 8.60 EUR according to the participant's performance in a German version of the Blackjack gambling task.
Prior to the experiment, participants were informed that the purpose of the experiment was to investigate brain waves during gambling. After receiving verbal instruction about the basic rules of the game, and after performing 25 practice trials, participants gave written consent for participation in the experiment.
The game consisted of a German version of “Blackjack” called “Seventeen and Four.” The game is usually played with the following cards of each suit (the points of each card are given in parentheses): Ace (11), King (4), Queen (3), Jack (2), Ten (10), Nine (9), Eight (8), and Seven (7). To make the symbols of the different cards comparable, a new type of card displaying 11 small symbols of the card's suit replaced the Ace. Similarly, the King was replaced by a Four, the Queen by a Three, and the Jack by a Two. The total card set was thus composed of the following cards: 2, 3, 4, 7, 8, 9, 10, and 11. As in Blackjack, the goal of the game was to get 21 points, or to approach 21 points as closely as possible, by successively drawing single cards from the bank without exceeding 21 points. In the present experiment, a computer simulated the opponent. Each game started with the simultaneous presentation of 2 cards for the player and 2 cards for the opponent, with the starting points of each pair of cards varying between 11 and 21. Furthermore, these values were balanced between player and opponent across the experiment. The cards of the player were depicted face up on the left side of the horizontal midline of a video screen positioned in front of the participant, and the cards of the opponent were presented face down at a small distance above the cards of the player (see Fig. 1). After this opening, players were presented with a third card, face down, to the right of the first 2 cards. Immediately after card presentation, participants were prompted by a tone to either accept (hit) or reject (stay) this card. The maximum time for the participant's decision was 1000 ms. If the participant chose a hit, the card's back side turned green, and then the front side of the chosen card was displayed. Then, the next card was offered, again face down, and the participant was prompted acoustically to either accept or reject this new card.
If the participant rejected this next card (or if the card points had already exceeded 21 points), the back side of this card turned red. Then the game continued with the opponent's turn. The opponent's strategy was to always hit at 14 points or lower and to stay whenever the sum of the cards was 15 or higher. At the end of the trial, the hand of the opponent was shown on the screen by turning all of the opponent's cards from back to front. At the same time, feedback was given to the player indicating whether he/she had won or lost on the trial by presenting the words “won” (“gewonnen”) or “lost” (“verloren”) on the screen. Finally, the next trial started 2300 ms after feedback presentation by showing the initial cards for the next game trial. For further information about the exact timing of card presentations, the succession of cards, and the participant's response deadline, see Figure 2.
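The fixed opponent rule described above can be sketched in a few lines. This is a minimal illustration, not the software used in the experiment: the function name `opponent_final_score` and the idea of passing the upcoming cards as a list are assumptions made for the example.

```python
# Card values of the modified deck described in the text.
CARD_VALUES = (2, 3, 4, 7, 8, 9, 10, 11)


def opponent_final_score(start_score, upcoming_cards):
    """Apply the opponent rule: hit at 14 or lower, stay at 15 or higher.

    `upcoming_cards` is a hypothetical list of the cards the opponent would
    draw, in order. If the list runs out while the score is still 14 or
    lower, the score reached so far is returned.
    """
    score = start_score
    for card in upcoming_cards:
        if score >= 15:  # opponent stays
            break
        if card not in CARD_VALUES:
            raise ValueError(f"card value {card} is not in the deck")
        score += card  # opponent hits
    return score


print(opponent_final_score(11, [2, 3, 10]))  # hits twice, then stays
print(opponent_final_score(14, [11]))        # hits once and exceeds 21
```

Because the rule depends only on the current score, the opponent never stays below 15 and can exceed 21 only by hitting at 11 to 14 points.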
The experiment consisted of 440 single game trials. According to the strategy played, participants received an average bonus of 2.76 € (SD = ±2.1) for successful trials. After the first part of the experiment, participants were invited to participate in a second experiment where they observed someone else playing a similar card game. Results of this part of the experiment will be presented elsewhere.
Behavioral data were analyzed using item response models (see below), and behavioral changes were parameterized. The number of changes in decision-making behavior after a loss, from sit to hit and vice versa, was counted separately for each specific score (e.g., 15 points). For correlation analyses with ERN activity, these data were aggregated across scores between 15 and 20 in order to provide 2 estimates of behavioral changes: 1) changes toward higher risk (sit to hit) after losses with a sit decision and 2) changes toward lower risk (hit to sit) after losses with a hit decision.
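One way to read this counting procedure is sketched below. The text does not give the exact operational definition, so the sketch assumes that a "change" is scored when the decision at a given score differs from the most recent decision at that same score and that earlier decision belonged to a lost trial. The record layout `(score, decision, lost)` and both function names are hypothetical.

```python
from collections import defaultdict


def count_changes_after_loss(records):
    """Count sit-to-hit and hit-to-sit changes after a loss, per score.

    `records` is a chronological list of (score, decision, lost) tuples,
    where decision is "sit" or "hit" and lost is True if the trial was lost.
    """
    last_at_score = {}  # score -> (decision, lost) of the latest trial there
    changes = defaultdict(lambda: {"sit_to_hit": 0, "hit_to_sit": 0})
    for score, decision, lost in records:
        if score in last_at_score:
            prev_decision, prev_lost = last_at_score[score]
            if prev_lost and prev_decision != decision:
                changes[score][f"{prev_decision}_to_{decision}"] += 1
        last_at_score[score] = (decision, lost)
    return dict(changes)


def aggregate_changes(changes, low=15, high=20):
    """Sum the per-score counts across scores from `low` to `high`."""
    totals = {"sit_to_hit": 0, "hit_to_sit": 0}
    for score, counts in changes.items():
        if low <= score <= high:
            for key in totals:
                totals[key] += counts[key]
    return totals


example = [
    (15, "sit", True), (15, "hit", True), (15, "sit", False),
    (16, "hit", True), (16, "hit", False), (15, "hit", False),
]
print(aggregate_changes(count_changes_after_loss(example)))
```

The two totals correspond to the two estimates named in the text: changes toward higher risk (sit to hit) and changes toward lower risk (hit to sit).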
Electroencephalography (EEG) Recording and Quantification
Participants were seated individually in an electrically shielded, dimly lit EEG cabin and electrodes were applied for the measurement of electrooculogram (EOG) and EEG. EEG was recorded from 61 electrode sites. Montage of electrodes was realized by the Easy-Cap electrode system (Falk Minow Services, Munich, Germany) and included all electrodes according to the 10–10 system plus both earlobes (A1, A2). The remaining electrodes were interspaced at equal distances between these electrodes. All sites were referenced to the vertex (Cz). A bipolar horizontal EOG was recorded from the epicanthus of each eye, and a bipolar vertical EOG was recorded from supra- and infraorbital positions of the left eye. The EEG and the EOG were recorded with Ag/AgCl electrodes. All electrode sites were cleaned with alcohol and gently abraded prior to electrode fixation in order to keep the impedances of electrodes below 5 kOhm, and the differences in impedance between homologous sites below 1 kOhm. EEG and EOG were amplified with a 64 channel AC amplifier (input impedance: 10 MOhm). Band-pass was set to 0.05–100 Hz (−12 dB/octave roll-off); the signals were digitized online at 500 Hz and stored to hard disk for later analyses.
After data acquisition, EOG and EEG recordings were subjected to an off-line ocular correction and artifact rejection procedure performed with the Vision Analyzer software (Version 1.04; BrainProducts GmbH, Munich, Germany). Trials with response times above 1000 ms were discarded from all analyses. Blinks and eye movements were corrected according to a method suggested by Gratton and others (1983). The continuous EOG and EEG recordings were visually inspected with a semiautomatic artifact rejection procedure, and each trial of the EEG data with major artifacts in one channel was rejected for this and all other channels. Then data were filtered (pass band of 1–20 Hz) and baseline corrected. For response-locked averages, the baseline was the average activity between 200 and 100 ms before the participant's response; for stimulus-locked (or feedback-locked) averages, the average activity of the last 100 ms before the stimulus (presentation of cards) was used. Finally, EEG waveforms were averaged separately for each participant, each experimental condition, and each electrode. For the present report, the statistical analysis of EEG data will be restricted to the following 5 channels: Fz, FCz, Cz, CPz, and Pz, all rereferenced to linked earlobes. The restriction of the electrode set in the statistical analyses was based on the hypothesis that a frontocentral midline maximum was expected for the ERN. For a more detailed topographical analysis, we used the brain electric source analysis (BESA) procedure (Berg and Scherg 1994) including all 61 recorded channels. We analyzed the ERN difference waveforms in a time window from zero to peak according to the maxima of the ERN. A principal component analysis was performed, and a first dipole was adjusted using a single vertically oriented starting position at the center of the head (see Gehring and others 2000).
Subsequently, a second dipole was included and adjusted testing several different starting positions (see Miltner and others 1997).
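The baseline correction and the mean-amplitude measures used in the analyses can be illustrated with a small sketch. It is not the Vision Analyzer procedure itself: the function names, the inclusive window boundaries, and the synthetic epoch are assumptions chosen for the example, while the 2 ms sample spacing follows from the 500 Hz digitization rate.

```python
def baseline_correct(epoch, times_ms, base_start_ms, base_end_ms):
    """Subtract the mean of the baseline interval from every sample."""
    baseline = [v for v, t in zip(epoch, times_ms)
                if base_start_ms <= t <= base_end_ms]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch]


def mean_amplitude(epoch, times_ms, win_start_ms, win_end_ms):
    """Mean amplitude of the samples inside a time window (inclusive)."""
    window = [v for v, t in zip(epoch, times_ms)
              if win_start_ms <= t <= win_end_ms]
    return sum(window) / len(window)


# Synthetic response-locked epoch: one sample every 2 ms (500 Hz).
times_ms = list(range(-200, 402, 2))
# Made-up signal: constant 5.0 before 50 ms, dropping to 2.0 afterwards.
epoch = [5.0 if t < 50 else 2.0 for t in times_ms]

# Response-locked baseline: 200 to 100 ms before the response.
corrected = baseline_correct(epoch, times_ms, -200, -100)
# Mean amplitude in a 50 to 100 ms post-response window.
ern_window_mean = mean_amplitude(corrected, times_ms, 50, 100)
print(ern_window_mean)
```

For stimulus-locked averages, the same functions would be called with a baseline of -100 to 0 ms and a later measurement window.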
Table 1 summarizes the empirical probabilities of participants winning against the opponent when they stopped hitting at scores between 11 and 21. As the table shows, participants were more likely to win when they sat with high total scores (19–21) and more likely to lose when they sat with low total scores (11–14).
|Score||11||12||13||14||15||16||17||18||19||20||21|
|p (win | sit)||0.04||0.16||0.10||0.10||0.33||0.31||0.54||0.52||0.78||0.85||1.00|
|p (win | hit)||0.50||0.43||0.36||0.29||0.23||0.27||0.33||0.23||0.13||0.00||0.00|
|p (win | hit + sum < 22)||0.58||0.57||0.58||0.58||0.61||0.72||0.88||0.93||1.00||0.00||0.00|
Note: p(win | sit) denotes the empirical probability of winning given a sit decision for each score. All other values have been estimated mathematically using the latter empirical probabilities as a basis. p (win | hit) denotes the probability of winning given a hit decision and given the strategy of stopping above 16 at decisions later in the trial. p (win | hit + sum < 22) denotes the conditional probability of winning given that a hit decision at each score leads to a final score of below 22.
The participants' average decision-making behavior is presented in Figure 3. As indicated by the graph, the probability of taking another card decreased as the player's current total points increased (11–21). This relation may be best described by a logistic function. Logistic response patterns have been described in detail by item response theory (see Hambleton and Swaminathan 1985; Fischer and Molenaar 1995), which suggests that the probability of a binary behavioral response to a test item (in the present case: a hit or a stay) is a logistic function of each participant's ability to successfully solve the test (in the present case, it may be termed degree of risk-taking behavior) and the difficulty of the item (here: the degree of risk when taking another card at each current score, which increased from 11 to 21 across each single game). The behavioral data were analyzed according to models of item response theory, which provide a parameter for the degree of risk taking of each participant (indicating the score at which a hit and a stay were equally likely, i.e., at which the probability of each was 0.5). This parameter may be termed risk threshold. In the present study, it ranged from 14.61 to 17.01 (mean = 15.68, SD = 0.60). A correlation analysis revealed that the risk-taking parameter was highly predictive of the amount of money participants won. The higher the risk threshold, the less money participants won (r = −0.57, P = 0.013). Thus, unlike in other gambling paradigms used in ERN research, the reward was contingent on participants' decision-making behavior.
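The role of the risk threshold can be illustrated with a simple logistic function. The text does not state which item response model was fitted, so this is only a sketch: the function `p_hit` and its `slope` parameter are hypothetical, and only the mean risk threshold of 15.68 is taken from the results above.

```python
import math


def p_hit(score, risk_threshold, slope=1.0):
    """Probability of a hit as a decreasing logistic function of the score.

    At score == risk_threshold the probability is exactly 0.5. The slope
    (steepness of the curve) is an illustrative assumption.
    """
    return 1.0 / (1.0 + math.exp(slope * (score - risk_threshold)))


MEAN_RISK_THRESHOLD = 15.68  # mean risk threshold reported in the text

for score in (11, 14, 15.68, 17, 21):
    print(score, round(p_hit(score, MEAN_RISK_THRESHOLD), 3))
```

A participant with a higher risk threshold has the whole curve shifted toward higher scores, that is, he or she keeps hitting at scores where a more cautious participant already stays.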
First, stimulus-locked EEG averages to the feedback will be presented to determine whether our data replicate previous findings for gambling tasks. Subsequently, we test whether feedback ERN amplitude is related to the probability of winning given particular total scores (see Table 1). Next, the timing of evaluative processes is analyzed in more detail by investigating stimulus-locked ERPs following the presentation of the third and fourth card before presentation of feedback. The question here is whether the system already implicitly evaluates the likely outcome of a trial (win or lose) before receiving “explicit” feedback about the results of the trial. Then we will determine whether this evaluation is different when the risk of loss following acceptance of an additional card is high rather than low. In addition, we examine response-locked ERPs following participants' key presses that indicate acceptance of an additional card. This analysis reveals whether the implicit evaluative process transfers to the postresponse period between participants' acceptance of a card (i.e., after confirming a hit/stay by a key press) and the explicit feedback about the outcome of his/her decision.
Do losses result in larger feedback ERNs than wins? The stimulus-locked mean ERP waveforms at the 5 electrodes for wins and losses and the corresponding difference wave (losses minus wins) are presented in Figure 4a,b. Mean ERP amplitudes between 300 and 350 ms following feedback for wins and losses were calculated for each participant and electrode. An analysis of variance (ANOVA) of electrodes (all 5 electrodes) by feedback (wins vs. losses) revealed a significant main effect of feedback, F(1,17) = 5.19, P = 0.036. The amplitudes were more negative (i.e., less positive) for losses than for wins (see Table 2), with the most negative amplitudes at electrode Fz. This result replicates previous findings on feedback negativity in response to monetary losses and implies increased ERN activity in response to negative feedback.
|Region||Period and condition|
|Kind of feedback||Points with additional card||Difference before card||Points before hit decision|
Note: Mean (above) and SD (below) for frontal (Fz), frontocentral (FCz), central (Cz), centroparietal (CPz), and parietal (Pz) positions.
Is loss-related ERN amplitude related to the probability of winning as reflected in the final score when participants sit? For this analysis, the feedback ERN was quantified as the mean amplitude between 300 and 350 ms of the difference wave of losses minus wins at 3 levels of total score: “15/16,” “17/18,” and “19/20.” One participant had to be excluded from this analysis because there were not enough trials in the “15/16” condition due to the participant's risky strategy. The ANOVA with factors electrodes (5) and total score (15/16 vs. 17/18 vs. 19/20) revealed a significant main effect of total score, F(2,32) = 3.32, P = 0.049. According to a significant linear trend in a post hoc contrast, F(1,16) = 5.71, P = 0.029, the amplitudes, with a maximum at FCz, became increasingly more negative from total score “15/16” to total score “17/18” and total score “19/20” (i.e., highest probability of reward; see Table 2). Figure 4c–e presents the difference waveforms between losses and wins for all 3 levels of total scores and depicts the topographic map for the score “19/20.” The source analysis of the difference waveform revealed a dipole in the medial frontal cortex for the feedback ERN to scores of 19 and 20 (see Figs 7a and 4f). For the conditions with lower scores (15/16 and 17/18), the dipole analysis did not reveal a medial frontal dipole, which is likely due to the markedly reduced ERN activity in these conditions.
It is important to note that there were no significant differences between the negativities in the stimulus-locked ERP to wins at 15/16, 17/18, and 19/20 (pairwise comparisons: P values > 0.286). In addition, a further analysis of difference waves (losses minus wins) revealed that the difference wave of unexpected (improbable) losses (19/20) minus improbable wins (15/16) was significantly more negative than the difference wave of probable losses (15/16) minus probable wins (19/20), F(1,16) = 12.80, P = 0.003. The latter results imply that the unexpectedness of an event per se does not trigger an ERN. Taken together, these findings suggest that the greater the reward expectation (probability of winning), the larger the feedback ERN amplitude after a loss.
Responses to an Additional Card
Do ERPs reflect an implicit evaluation of the likely outcome of a trial before feedback is presented? To address this question, we examined the stimulus-locked ERPs following presentation of the third and fourth card. We first analyzed whether brain electrical activities differed when participants' total scores were above 21 (a loss) as compared with conditions where participants' total scores stayed below 22 (good result). The respective mean ERP waveforms at the 5 electrodes and the difference wave between both conditions are presented in Figure 5a,b. The mean amplitude was calculated for each participant and each condition in a time window from 250 to 350 ms following the presentation of the third or fourth card. The ANOVA of electrodes (5) by result (good vs. bad) revealed a significant interaction of electrodes by result, F(4,68) = 14.37, P < 0.001, indicating that the amplitudes in response to the presentation of a card resulting in a score above 21 were relatively more negative at frontal, frontocentral, and central electrodes than to a card for which the total scores stayed below 22 points (see Table 2 and Fig. 5b). This finding reveals that the feedback ERN activity is already increased in response to an outcome that indicates a subsequent loss.
Is this negativity affected by the amount of risk in taking the third or fourth card? Here, the feedback ERN was quantified as the mean amplitude between 250 and 350 ms of the difference wave between losses and good results at 2 levels of starting scores that were comparable with respect to the range of possible outcomes: “13/14” (low risk) and “15/16” (medium risk). Figure 5c,d presents the difference waveforms. The ANOVA with electrodes (5) and starting score (“13/14” vs. “15/16”) revealed a significant interaction of electrodes and starting score, F(4,68) = 6.29, P = 0.011. The amplitudes, with a maximum at FCz (see Fig. 5e for a topographic map for the level “15/16”), were more negative for losses following relatively more risky decisions (i.e., medium risk, “15/16,” see Table 2) than following relatively less risky decisions (i.e., low risk, “13/14,” see Table 2). The source analysis of the difference waveform revealed a dipole in the medial frontal cortex for the medium-risk condition (see Fig. 7b). For the low-risk condition, the dipole analysis did not reveal a medial frontal dipole, which is likely due to the markedly reduced ERN activity in this condition. Thus, this result indicates greater feedback ERN activity to an outcome indicating a subsequent loss after more risky hit decisions.
Responses to Decision
In this section, we consider response-locked ERPs to participants' key presses indicating acceptance of a third or fourth card (hit). According to the mean decision criterion of 15.68 (see Fig. 3), hit decisions above 16 were classified as high-risk hit decisions, hit decisions at 16 as medium-risk, and hit decisions below 16 as low-risk hit decisions. For this analysis, 2 participants had to be excluded because they did not have enough trials with high-risk hit decisions (hits following scores over 16). The ANOVA with the factors electrodes (5) by risk level (<16, =16, >16) of average ERP amplitudes within the window of 50–100 ms post response revealed a significant main effect of risk level, F(2,30) = 3.81, P = 0.034. Response-locked ERN amplitudes after high-risk hit decisions were more negative than ERP amplitudes in response to both medium- and low-risk hit decisions (see Table 2). Figure 6a,b displays the ERPs and difference waveforms for trials with high-risk hit decisions minus trials with low- and medium-risk hit decisions. This difference waveform reveals a clear negative peak after 85 ms with a frontocentral maximum (see Fig. 6c). The source analysis of the difference waveform revealed a dipole in the medial frontal cortex (see Fig. 7c). Thus, response-locked ERN activity is increased after high-risk hit decisions, indicating that the ERN process is sensitive to an early evaluation of such decisions.
Additional analyses addressed the relation between participants' risk threshold and the ERN peak amplitude in the difference waveforms. A first analysis revealed no significant relation between the risk threshold and the difference amplitudes to subsequently drawn cards or to feedback. A further analysis revealed a significant correlation between the risk threshold and the response-locked ERN peak amplitude of the difference wave (60 ms search window centered at 85 ms, average ±10 ms) of hit decisions at scores of 17 and higher minus hit decisions at scores of 16 and lower, r = 0.43, P < 0.05. Compared with low-risk, cautious participants, high-risk participants showed smaller response-locked ERN amplitudes to high-risk hit decisions. Across the course of the experiment, cautious participants tended to avoid high-risk hit decisions as compared with participants with higher risk thresholds. This suggests a direct relationship between the negative evaluation of certain decisions, as manifested in the ERN, and the risk threshold (i.e., the decision-making behavior of participants). This is in line with the idea that the ERN process signals the negative evaluation of previous decisions, which then influences decision-making behavior. Participants with larger response-locked ERN amplitudes in response to high-risk hit decisions tended to avoid these decisions across the experiment.
A further correlation analysis between the number of trials in the relevant high-risk hit condition, the risk threshold, and the ERN amplitude confirmed that the former effect was not due to an effect of unexpectedness or rareness on the ERN. There was no significant correlation between the ERN amplitude and the number of trials (r = 0.29, P = 0.28). In addition, as may be expected, there was a highly significant correlation between the risk threshold and the number of trials (r = 0.88). When the variance of the risk threshold parameter was removed from the number of trials, the relation between the residual of the number of trials and ERN amplitude became negative (r = −0.18), which corroborates the view that the relationship between the ERN and risky behavior is not due to unexpectedness or rareness.
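The size of this residual relation can be checked approximately from the reported correlations with the standard semipartial (part) correlation formula. The sketch assumes that the correlation of r = 0.43 between risk threshold and response-locked ERN amplitude, reported in the preceding paragraph, refers to the same ERN measure and sample; the function name `semipartial_r` is chosen for illustration, and the inputs are the rounded published values.

```python
import math


def semipartial_r(r_xy, r_xz, r_yz):
    """Correlation of x with the part of y that is independent of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1.0 - r_yz ** 2)


r_ern_trials = 0.29        # ERN amplitude vs. number of trials
r_ern_threshold = 0.43     # ERN amplitude vs. risk threshold (assumed same measure)
r_threshold_trials = 0.88  # risk threshold vs. number of trials

residual_r = semipartial_r(r_ern_trials, r_ern_threshold, r_threshold_trials)
print(round(residual_r, 2))
```

With these rounded inputs the result is about −0.19, close to the reported value of −0.18; the small difference would be consistent with rounding of the input correlations.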
Subsequent analyses of behavioral changes revealed a significant relation between the behavioral change after a loss from hit to sit decisions and the difference of the ERN amplitudes (r = −0.57, P = 0.007) to bad outcomes after medium-risk hit decisions (15/16) as compared with low-risk decisions (13/14). Participants with relatively greater ERN amplitudes to a bad additional card (a bust) at medium risk as compared with low risk showed a stronger tendency to change to more cautious decisions. Furthermore, there were significant correlations between behavioral changes after a loss from sit to hit decisions and feedback-related ERN amplitudes at final feedback for the difference wave at scores of 15/16 (r = −0.45, P = 0.034) and for the difference wave at 17/18 (r = −0.53, P = 0.013). For scores of 19/20, the relation was not significant (r = −0.17, P = 0.258). Thus, participants exhibiting increased feedback-related ERN amplitudes to losses with 15/16 and 17/18 showed a stronger tendency to change to more risky decisions.
The ERP results for the feedback about the outcome of the present Seventeen and Four card game clearly replicate previous findings of gambling paradigms, namely, that the feedback ERN amplitude to losses is greater than to wins (e.g., Gehring and Willoughby 2002; Nieuwenhuis, Yeung, and others 2004; Yasuda and others 2004; Yeung and Sanfey 2004). Furthermore, the ERN difference wave to losses minus wins increased with increasing total scores (from 15 and 16 over 17 and 18 to 19 and 20), indicating that the ERN increased with the degree to which a loss was unexpected. This effect could not be attributed to the unexpectedness of the event itself (cf. Holroyd 2004). Because the expectation for a win also increases with the total score, the present observations are in line with the proposal that the ERN process may code a negative TD error during learning. In the case of a loss, the current reward expectation drops to zero when negative feedback is delivered. As a result, negative feedback produces a larger decrease in reward expectation, and therefore a larger negative TD error, in a situation with a greater reward expectation (e.g., the higher probability of winning with 19/20, see Table 1) than in a situation with a lower reward expectation (e.g., the lower probability of winning with 15/16).
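The argument can be made concrete with the empirical probabilities in Table 1. The sketch below rests on several simplifying assumptions that are not part of the original analysis: the reward expectation at the time of feedback is equated with p(win | sit), a loss is coded as a reward of zero, the two scores within a band are weighted equally, and the columns of Table 1 are taken to correspond to scores 11 to 21 in ascending order.

```python
# p(win | sit) for scores 15 to 20, read from Table 1 (column order assumed).
P_WIN_SIT = {15: 0.33, 16: 0.31, 17: 0.54, 18: 0.52, 19: 0.78, 20: 0.85}


def loss_td_error(scores):
    """Drop in reward expectation when a loss is signaled.

    The expectation is the mean p(win | sit) of the given scores and falls
    to zero after negative feedback, so the error is 0 minus the expectation.
    """
    expectation = sum(P_WIN_SIT[s] for s in scores) / len(scores)
    return 0.0 - expectation


for band in ((15, 16), (17, 18), (19, 20)):
    print(band, round(loss_td_error(band), 3))
```

Under these assumptions the error grows in magnitude from roughly −0.32 at 15/16 to −0.53 at 17/18 and −0.82 at 19/20, which is the same ordering as the observed feedback ERN amplitudes.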
Our data also suggest that an evaluation of reward expectation occurs not only at the time of feedback, but is also present when participants receive outcome relevant information (in the form of a third or fourth card). A “bad next card” elicits more negative ERP amplitudes than a “good next card.” The analysis of the ERPs averaged to decision-making responses further suggests that evaluative processes take place even before participants receive such outcome relevant information. The results revealed increased response-locked ERN amplitudes to high-risk hit decisions above 16. This indicates that evaluative processes are already ongoing even as decisions are being made and that high-risk hit decisions are evaluated immediately. Because a hit decision above 16 strongly decreases the probability of winning as compared with a sit decision, an associated decrease in reward expectation (e.g., of about 20% for 17, see difference of sit versus hit in Table 1) would constitute a negative TD error. In addition, participants with larger response-locked ERN amplitudes to high-risk hit decisions tended to play more cautiously and thus to avoid high-risk hit decisions across the experiment.
The relation between ERN amplitude and decision making may be compared with recent findings in similar studies (Yasuda and others 2004; Yeung and Sanfey 2004). Our findings show that high-risk hit decisions above 16 lead to a relatively large response-locked ERN compared with hit decisions below 16 and that the amplitude of this response-locked ERN is related to more cautious decisions across the experiment. In contrast, Yeung and Sanfey (2004) reported an increased likelihood of risky decisions subsequent to greater stimulus-locked ERN activity in preceding trials (similar to Yasuda and others 2004). This divergence may be due to the fact that feedback was contingent upon behavior in our study, whereas this was not the case in the Yeung and Sanfey (2004) study, where the feedback was not informative for the participant with respect to a future decision. In particular, in the latter study, feedback was given pseudorandomly, and the participants won on 50% of the trials. Thus, a participant with increased subjective reward expectation (and according to RL theory with increased ERN to a loss) may be more likely to follow the gambler's fallacy and make more risky decisions because these decisions have a higher subjective outcome. Moreover, these risky decisions do not lead to systematic adverse consequences under pseudorandomized feedback. In contrast, in the present study, highly risky behavior led to systematic negative consequences, in particular to a decrease in reward (r = −0.57), and thus shaped behavior toward more cautious decisions. On the other hand, the feedback-locked and the response-locked ERN may be different phenomena (see also Gehring and Willoughby 2004), which may be an alternative explanation for the divergent results because our finding is based on response-locked ERN data and the finding by Yeung and Sanfey (2004) on feedback-locked ERN data.
Taken together, the above findings are in line with the suggestion that the ERN process is involved in RL, a process that is sensitive to the negative deviation from a reward expectation in response to errors, feedback, and monetary losses (e.g., Holroyd and Coles 2002; Holroyd and others 2002; Holroyd and others 2003; Nieuwenhuis, Holroyd, and others 2004). The above results further suggest that the ERN may be considered as a signature of the negative evaluation of an action or decision and may thus facilitate the avoidance of such actions or decisions in the future (for a similar finding, see Frank and others 2005).
However, a more detailed analysis of the ERPs to the presentation of a third or fourth card revealed that feedback-related ERN difference waves for total scores over 21 minus total scores below 22 for 2 levels of risk (low with 13 and 14 and medium with 15 and 16) were greater for bad outcomes after preceding medium-risk hit decisions as compared with low-risk hit decisions. This particular finding appears to contradict the RL theory because according to a mathematical estimate, the reward expectation for low-risk is 33% versus 25% for medium-risk hit decisions (see Table 1). Accordingly, the difference in reward expectation for a hit at 13/14 as compared with a hit at 15/16 is rather small and inverse to the TD hypothesis. Thus, an explanation solely based on the degree of reward expectation or the TD error seems to be insufficient.
One approach to explain the latter finding is to propose that the participants chose to hit with the intention (expectation) to stay below 22 with the additional card. On average, this intended result (staying below 22) is associated with higher reward expectation for medium-risk (67%, see Table 1) as compared with low-risk hit decisions (58%). Accordingly, the ERN to the additional card might reflect the TD error related to the decrease in the reward expectation that is based on the intention/goal of the participant. Thus, the deviation of an outcome from an intention would trigger the ERN, and the ERN amplitude would be proportional to the reward expectation associated with that intention. This interpretation suggests that the brain calculates a conditional probability of reward based on its intentions (calculating the probability of winning if staying below 22). The latter idea of an evaluation of an intended reward expectation by the ERN process is consistent with the proposal of Paus (2001), who suggested that the anterior cingulate participates in the translation of intentions into actions. Because the ACC has connections to brain areas responsible for drive and arousal, for motor output, and for cognition, it is in an ideal position to perform this translation (see Paus 2001). More specifically, the negative evaluation of an intended reward expectation may reduce the probability that the respective intention obtains access to action. RL theory is based on this functional view of the ACC (see Holroyd and Coles 2002). However, the idea that the ACC deals with reward expectation based on an intention deviates from the original RL model, which proposed that the ERN represents the computation of a more general TD error based on the subjective reward expectation of the current situation.
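The contrast between the two accounts can be made concrete with a small numerical sketch. It uses only the reward expectations quoted in the text from Table 1 (33% and 25% for the hit decision as such, 58% and 67% conditional on staying below 22) and assumes, for illustration, that a bust reduces the reward expectation to zero. The function and variable names are hypothetical and are not part of the original analysis.

```python
def td_error(outcome_value: float, expected_reward: float) -> float:
    """Temporal-difference error: value obtained minus value expected."""
    return outcome_value - expected_reward

# Reward expectations (probability of winning) as quoted in the text.
UNCONDITIONAL = {"low (13/14)": 0.33, "medium (15/16)": 0.25}
GIVEN_BELOW_22 = {"low (13/14)": 0.58, "medium (15/16)": 0.67}

BUST_VALUE = 0.0  # assumption: after a bust no chance of winning remains


def bust_errors(expectations: dict) -> dict:
    """TD error elicited by a bust for each risk level."""
    return {risk: td_error(BUST_VALUE, p) for risk, p in expectations.items()}


standard = bust_errors(UNCONDITIONAL)      # average reward expectation
intentional = bust_errors(GIVEN_BELOW_22)  # expectation tied to the intention

for risk in UNCONDITIONAL:
    print(f"{risk}: standard TD {standard[risk]:+.2f}, "
          f"intention-based TD {intentional[risk]:+.2f}")
```

Under the average reward expectation the bust produces the larger error at low risk, whereas under the intention-based expectation it produces the larger error at medium risk, which is the ordering observed in the ERN data.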
In terms of RL theory, an intention most closely resembles an action policy, which may be considered as a plan of action or more generally a plan of events (if in situation S use action A that leads to situation S′). The intentional view suggests that the focus of evaluation is the action policy rather than the average reward expectation of the current situation. Accordingly, the success of one's action policy is being evaluated, and an ERN is elicited when an intention/action policy fails rather than when a reduction in current reward expectation is detected. In most cases, the 2 accounts may yield the same predictions because any action policy/intention is associated with a certain reward expectation. However, in certain cases, they may yield different predictions. For example, an action A is taken in order to lead to a situation S, but an opponent causes the action to lead to situation S′ instead. If the same reward expectation were associated with situations S and S′, the intentional view would suggest that the deviation from the intention or action policy is sufficient to elicit an ERN although reward expectation does not change. In contrast, the TD error view would not expect an ERN in such a situation. Thus, future research is needed to clarify whether a deviation from an intention is a sufficient condition, a necessary condition, or neither for the elicitation of an ERN.
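The diverging predictions in the S versus S′ example can be written down as two simple decision rules. This is only a schematic sketch: the class, the function names, and the reward expectation of 0.5 are hypothetical placeholders chosen for illustration, not quantities from the study.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    intended_state: str              # situation the action was meant to reach
    reached_state: str               # situation actually reached
    prior_reward_expectation: float  # expectation before the outcome
    new_reward_expectation: float    # expectation after the outcome


def td_view_predicts_ern(o: Outcome) -> bool:
    """TD error view: an ERN requires a drop in reward expectation."""
    return o.new_reward_expectation < o.prior_reward_expectation


def intentional_view_predicts_ern(o: Outcome) -> bool:
    """Intentional view: an ERN requires a deviation from the intended state."""
    return o.reached_state != o.intended_state


# An opponent diverts the action to S' while reward expectation is unchanged.
diverted = Outcome("S", "S_prime", 0.5, 0.5)
print(td_view_predicts_ern(diverted), intentional_view_predicts_ern(diverted))
```

For the diverted outcome only the intentional rule predicts an ERN, which is the kind of case that could distinguish the two accounts empirically.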
The proposed view that a deviation of an outcome from an intention triggers the ERN would also imply a different explanation for some of the other findings of the present study. For example, the response-locked ERN to the hit decisions may be interpreted as a deviation from the intention of the participants not to hit above a certain score. Accordingly, the response-locked ERN to high-risk hits (as opposed to medium and low risk) may reflect the deviation from the policy of the participants not to hit above 16 because the high-risk hits are clearly beyond the risk threshold (mean = 15.68, SD = 0.60). This implies that the observed response-locked ERN would be the consequence of an “erroneous” decision or response. For the feedback-locked ERN following final feedback, the interpretation is rather similar to the initial TD error explanation. The reward expectation associated with the intention to win should be similar to the current reward expectation just before feedback. This intentional interpretation clearly deviates from the original RL theory of the ERN. While both share the idea that the amplitude of the ERN is proportional to the degree of a reward expectation, the intentional explanation proposes that a participant's unaccomplished intention or action policy triggers ERN.
Another interpretation might reconcile the above findings more directly with the RL theory of the ERN. According to Holroyd and Coles (2002), the ERN amplitude is proportional to the TD error multiplied by the eligibility trace. The eligibility trace is a parameter in learning theory and indicates that a population of neurons remains eligible for learning leading to behavioral change. Accordingly, it may be hypothesized that the eligibility trace parameter is greater after hit decisions with medium risk (of 15 and 16) as compared with low risk (13 and 14), whereas the difference in reward expectation is probably small. This hypothesis is corroborated by the observation that the average number of behavioral changes per participant (i.e., the number of changes from a hit to a sit decision and vice versa, reflecting an openness for learning) is 23.7 (SD = 6.3) for medium-risk trials (15 and 16) and only about 6.4 (SD = 6.1) for low-risk trials (13 and 14). Thus, greater behavioral change, which may reflect more openness to learning, is accompanied by greater ERN amplitudes in the medium-risk condition. Furthermore, a correlational analysis of interindividual differences examined whether specific relevant behavioral changes are related to the differences in the ERN waves between these conditions. There was a significant relationship between the behavioral change after a loss with a hit decision to a sit decision and the difference of the ERN amplitudes (r = −0.57, P = 0.007). Thus, participants who showed greater behavioral changes also showed a greater increase in the ERN amplitudes to bad outcomes after medium-risk hit decisions as compared with low-risk decisions. Another finding further corroborates the idea that increased ERN amplitudes might be associated with greater behavioral changes or an increased eligibility of neurons for learning. 
There were significant correlations between the relevant behavioral changes from sit to hit after losses with a sit decision and the feedback-related ERN amplitudes for the difference wave at 15/16 (r = −0.45, P = 0.034) and for the difference wave at 17/18 (r = −0.53, P = 0.013); the correlation was not significant for 19/20 (r = −0.17, P = 0.258). This finding reveals an association between increased feedback-related ERN amplitudes and stronger behavioral shifts. Because the ERN amplitudes are supposed to be proportional to both the eligibility trace and the TD error (or reward expectation), it may be argued that the differences between the 3 conditions of final scores are mainly due to differences in reward expectation, whereas the differences between participants reveal that the ERN may also be amplified by the eligibility trace. In particular, this is primarily the case for lower scores because behavioral changes for higher scores (19/20) were rarely observed.
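The multiplicative account can be illustrated with the numbers reported above. The sketch assumes that a bust reduces the reward expectation to zero, so that the TD error magnitudes are 0.33 (low risk) and 0.25 (medium risk), and it treats the mean number of behavioral changes (6.4 versus 23.7) as a rough proxy for the relative size of the eligibility trace. That proxy is purely an illustrative assumption; the study did not estimate the eligibility trace, and all names below are hypothetical.

```python
def ern_magnitude(td_error: float, eligibility: float, gain: float = 1.0) -> float:
    """Sketch: ERN magnitude scales with |TD error| times eligibility trace."""
    return gain * abs(td_error) * eligibility


# TD errors for a bust, from the reward expectations quoted in the text.
TD_LOW, TD_MEDIUM = -0.33, -0.25

# Eligibility ratio (medium/low) needed before the medium-risk ERN is larger.
required_ratio = abs(TD_LOW) / abs(TD_MEDIUM)

# Illustrative proxy only: mean behavioral changes per participant.
CHANGES_LOW, CHANGES_MEDIUM = 6.4, 23.7
proxy_ratio = CHANGES_MEDIUM / CHANGES_LOW

low = ern_magnitude(TD_LOW, eligibility=1.0)
medium = ern_magnitude(TD_MEDIUM, eligibility=proxy_ratio)

print(f"required eligibility ratio: {required_ratio:.2f}")
print(f"proxy eligibility ratio:    {proxy_ratio:.2f}")
print(f"relative ERN magnitude, low vs medium: {low:.2f} vs {medium:.2f}")
```

On these assumptions, a medium-risk eligibility trace only about one third larger than the low-risk one would already reverse the ordering predicted by the TD error alone, so a modest difference in openness to learning could account for the observed pattern.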
Taken together, most of the findings of the present study are in line with the RL theory of the ERN (e.g., Holroyd and Coles 2002; Holroyd and others 2002; Holroyd and others 2003; Nieuwenhuis, Holroyd, and others 2004). First, we found greater feedback ERN activity to losses as compared with wins. Second, this increase of feedback ERN activity was shown to be linearly related to the degree of reward expectation. Third, greater ERN activity was found for busts as compared with good outcomes for subsequent cards. Fourth, a response-locked ERN was found for high-risk hits, which is in line with an expected TD error for such hits. Fifth, the amplitude of the response-locked ERN was significantly related to more cautious behavior. However, one particular finding contradicts the proposal that the TD error is exclusively related to ERN amplitude. The feedback ERN was greater for bad outcomes after preceding medium-risk hit decisions as compared with low-risk hit decisions, which suggests that the RL theory may have to be amended. Two possibilities of revising the RL theory of the ERN have been suggested in the present manuscript. First, the intention of the participant may be the key to the understanding of the ERN phenomena presented here. Accordingly, the ERN amplitude would be proportional to the reward expectation associated with the “unfulfilled” intention of the participant. Alternatively, the relevance for learning and future behavior may be a critical variable and may be implemented in the form of an eligibility trace parameter. Accordingly, ERN amplitudes are proposed to be proportional to reward expectation and to the degree of openness to learning of populations of neurons involved in the task.
In summary, our data indicate that both internal and external information is used to evaluate behavior in terms of its success or failure, and failure may lead to the avoidance of behaviors that precede it. Furthermore, the present data imply that decision making in a more complex game such as Blackjack might involve additional cognitive variables and the evaluation of several different kinds of reward expectation. Further research is necessary to explore this possibility. Recent studies have provided a mixed pattern of results regarding a TD error explanation of ERN findings (e.g., Holroyd, Nieuwenhuis, and others 2003; Hajcak, Holroyd, and others 2005; Hajcak, Moser, and others 2005; Holroyd and others 2006). Additional research will be necessary to examine the relations between different theoretical accounts of ERN phenomena (Holroyd and Coles 2002; Luu and others 2003; Yeung and others 2004), which might provide a more final assessment of the explanatory power of RL theory for ERN phenomena. In particular, the relation between behavioral change and ERN amplitude and the relation between intention and reward expectation in RL theory need further elaboration. Finally, our analysis of the behavior of the ERN in a realistic gambling paradigm suggests an avenue for the investigation of pathologic gambling and may further corroborate the role of dopamine in such behaviors (Comings and others 1996, 1999; Hollander and others 2000) because dopamine has been proposed to play an important role in the generation of the ERN (e.g., Holroyd and Coles 2002).
Special thanks to Prof. Dr Althöfer, Dr Stefan Schwarz, and colleagues from the Department of Mathematics of Friedrich-Schiller-University Jena for helpful discussions and support concerning a mathematical view of the game. Conflict of Interest: None declared.