Abstract

Learning progressively more abstract stimulus–response mappings requires progressively more anterior regions of the lateral frontal cortex. Using an individual differences approach, we studied subjects with frontal lesions performing a hierarchical reinforcement-learning task to investigate how frontal cortex contributes to abstract rule learning. We predicted that subjects with lesions of the left pre-premotor (pre-PMd) cortex, a region implicated in abstract rule learning, would demonstrate impaired acquisition of second-order, as opposed to first-order, rules. We found that 4 subjects with such lesions did indeed demonstrate a second-order rule-learning impairment, but that these subjects nonetheless performed better than subjects with other frontal lesions in a second-order rule condition. This finding resulted from both their restricted exploration of the feature space and the task structure of this condition, for which they identified partially representative first-order rules. Significantly, across all subjects, suboptimal but above-chance performance in this condition correlated with increasing disconnection of left pre-PMd from the putative functional hierarchy, defined by reduced functional connectivity between left pre-PMd and adjacent nodes. These findings support the theory that activity within lateral frontal cortex shapes the search for relevant stimulus–response mappings, while emphasizing that the behavioral correlate of impairments depends critically on task structure.

Introduction

The capacity to adapt rapidly and flexibly to novel circumstances represents a fundamental feature of higher cognitive function. As shown by multiple investigators, this ability to abstract—for example, to discover higher order relationships (Robin and Holyoak 1995), to chunk lower level items (Chase and Simon 1973; Newell 1990), and to analogize/transfer information to new situations (Gick and Holyoak 1980,1983)—depends critically on the frontal and prefrontal cortex (Koechlin et al. 2003; Bunge 2004; Petrides 2006; Bor and Owen 2007; Christoff and Keramatian 2007; Koechlin and Summerfield 2007; Badre 2008; Badre and D'Esposito 2009). In addition, multiple theoretical and empirical accounts have suggested that more anterior regions of the frontal lobe support more abstract representations (reviewed in Badre 2008), though these accounts differ in important ways. One theory emerging from the study of working memory argues that increasingly anterior regions of prefrontal cortex represent domain-general, as opposed to domain-specific, information (Christoff and Gabrieli 2000; Buckner 2003; Courtney 2004). In this formulation, more caudal regions of the frontal cortex maintain the location of an object across a delay, for example, while more rostral regions maintain both location and object identity. Another theory argues that increasingly rostral areas of prefrontal cortex represent greater relational complexity (Robin and Holyoak 1995; Christoff et al. 2001; Christoff and Keramatian 2007). Under this formulation, the number of relationships required to generate a response—that is, whether the response depends upon the color of an object (one relation) or the match between colors of different objects (two relations)—determines the locus of brain activity (Christoff and Keramatian 2007; and discussed in Badre 2008). 
Importantly, neither of these theories necessitates that processing in lower order brain regions is dependent upon processing in higher order regions, or vice versa.

Recently, a number of studies have suggested that our capacity to apply rules to new situations, and to identify new rules based on current circumstances, may be supported by a specifically hierarchical organization of lateral frontal cortex—that is, by an organization in which progressively more anterior regions process progressively more abstract representations, and in which superordinate frontal regions modulate responses in subordinate ones (Koechlin et al. 2003; Koechlin and Jubault 2006; Badre and D'Esposito 2007; Badre et al. 2009). Support for this idea (discussed further below) has come from studies based on policy abstraction, a form of abstraction in which first-order stimulus–response mappings are potentially contingent upon more abstract second-order (and higher order) rules (Badre and D'Esposito 2007; Badre et al. 2009). Notably, in these theories, interactions between representations at different levels of abstraction are critical.

To evaluate whether a policy abstraction account might explain how we learn abstract action rules, we recently designed a novel reinforcement-learning task (Badre et al. 2010). In this task, participants were required to learn 2 sets of rules, in separate epochs, that linked each of 18 different stimuli uniquely and deterministically to 1 of 3 button-press responses (Fig. 1). For each rule set, an individual stimulus consisted of 1 of 3 shapes, at 1 of 3 orientations, inside a box that was 1 of 2 colors for a total of 18 unique stimuli (3 shapes × 3 orientations × 2 colors). Participants initially learned the correct one of the 3 possible button-press responses, for each stimulus, based on trial-and-error responding (Fig. 1A). For 1 of the 2 rule sets—the “Flat” set—each of the 18 rules had to be learned individually as one-to-one mappings (first-order policy) between a conjunction of color, shape, and orientation and a response (Fig. 1B,D). In the other set—the “Hierarchical” set—stimulus display parameters and instructions were identical to those for the Flat set. In fact, the Hierarchical set could also be learned as 18 individual first-order rules. However, the stimulus–response mappings were defined such that a second-order relationship could be learned instead, thereby reducing the number of first-order rules to be learned (Fig. 1C). Specifically, in the context of 1 colored box, only the shape dimension was relevant to the response, with each of the 3 unique shapes mapping to one of the 3 button responses irrespective of orientation. In contrast, in the context of the other colored box, the orientation dimension fully determined the response. Thus, the Hierarchical rule set permitted learning of abstract, second-order rules mapping color-to-feature along with 2 sets of first-order rules (i.e., specific shape-to-response and orientation-to-response mappings; Fig. 1E).
Critically, this simplifying second-order structure did not permit subjects to simply ignore one of the stimulus features. For example, learning only the first-order mapping of orientation to response in the Hierarchical case would fully identify one half of the stimulus space—that in which only orientation determined the response—but not the other half of the stimulus space, in which only shape determined the response.
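The structure of the Hierarchical rule set can be made concrete with a short sketch. The Python below enumerates the 18 stimuli and builds one possible Hierarchical mapping; the feature labels, the assignment of colors to dimensions, and the response codes are illustrative assumptions, not the stimuli or mappings used in the study. It then computes the expected accuracy of a hypothetical learner who acquires only the shape mapping and guesses at chance on the remaining stimuli.

```python
from fractions import Fraction
from itertools import product

# Illustrative labels only (assumptions): the study used nonsense shapes and
# counterbalanced colors that are not reproduced here.
SHAPES = ["shape_a", "shape_b", "shape_c"]
ORIENTATIONS = ["up", "left", "oblique"]
COLORS = ["red", "blue"]
N_RESPONSES = 3

# 3 shapes x 3 orientations x 2 colors = 18 unique stimuli
STIMULI = list(product(SHAPES, ORIENTATIONS, COLORS))


def hierarchical_response(shape, orientation, color):
    """Second-order rule: the color selects which first-order mapping applies."""
    if color == "red":
        return SHAPES.index(shape)            # shape -> response, orientation ignored
    return ORIENTATIONS.index(orientation)    # orientation -> response, shape ignored


def expected_accuracy_shape_only():
    """Learner knows the shape mapping for the 'red' half and guesses elsewhere."""
    total = Fraction(0)
    for _shape, _orientation, color in STIMULI:
        if color == "red":
            total += 1                         # covered by the learned first-order rule
        else:
            total += Fraction(1, N_RESPONSES)  # uniform guess over 3 buttons
    return total / len(STIMULI)


if __name__ == "__main__":
    print(len(STIMULI), expected_accuracy_shape_only())
```

Under these assumptions, a learner restricted to one first-order mapping is correct on all 9 stimuli in one color context and on one third of the remaining 9, for an expected accuracy of 2/3: above the chance level of 1/3 but well below ceiling.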

Figure 1.

Schematic depiction of trial events, example stimulus-to-response mappings, and policy for Hierarchical and Flat rule sets. (A) Trials began with presentation of the stimulus for 5 s, during which subjects could respond with a button press at any point. Immediately after stimulus presentation, participants received auditory feedback indicating whether the response they had chosen was correct given the presented stimulus. Trials were separated by a variable intertrial interval with a mean of 1.5 s. (B) Example stimulus-to-response mappings for the Flat set. The arrangement of mappings for the Flat set was such that no higher order relationship was present; thus, each rule had to be learned individually. (C) Example stimulus-to-response mappings for the Hierarchical set. Response mappings are grouped such that in the presence of a red square, only shape determines the response, while in the presence of a blue square only orientation determines the response. (D) The Flat set of many first-order rules can be represented as a large flat policy structure with only 1 level and 18 alternatives. (E) The Hierarchical set can be represented as a 2-level policy structure with a second-order rule selecting between the shape or orientation mapping sets and a set of first-order rules linking specific shapes or orientations to responses.

In this work (Badre et al. 2010), we demonstrated that previously identified first- and second-order cortical regions in the left lateral frontal cortex are associated with learning first- and second-order stimulus–response mappings (left dorsal premotor [PMd] and left dorsal pre-premotor [pre-PMd] cortex; Picard and Strick 2001), respectively. Along with higher order mappings associated with even more anterior regions—activations within inferior frontal sulcus and a region within frontopolar cortex, for example, have been associated with third- and fourth-order mappings, respectively (Badre and D'Esposito 2007)—these regions define an anatomical hierarchical ordering of first- through fourth-order representations. However, arguing that this hierarchical organization is important for abstract rule acquisition, and that it might provide a more complete explanation for the gradient of abstraction in frontal cortex than other theories, would benefit greatly from an approach that addresses disruption of the system. For this reason, here we investigate the effects of lesions in relevant cortices on neural function.

If a hierarchical organization impacts second-order rule identification, at least 3 specific predictions follow. First, disruption of left pre-PMd, but not closely adjacent cortical areas, should disrupt acquisition of second-order rules. Second, because this area is enmeshed in a hierarchical structure, the degree to which this region is disconnected from other (i.e., adjacent first- and third-order) nodes in the hierarchy—that is, its residual intrinsic connectivity within the hierarchy—should correlate with task performance. Neither of these predictions is required for other theories of prefrontal cortical (PFC) performance, as these theories either specify different (or broader) cortical regions, or do not necessarily depend upon functional connections with other areas. Third, the dependence on these functional connections should itself vary with the structure/level of abstraction of the task. To address the above hypotheses about this putative lateral frontal hierarchy, we followed an individual differences approach to test subjects with frontal lobe lesions on our hierarchical reinforcement-learning task.

Materials and Methods

Participants

Eighteen English-speaking subjects (mean age 61.7 ± 9.3 years, range 44–75 years) with single lesions due to ischemic stroke (n = 11), intracerebral hemorrhage (n = 3), traumatic brain injury (n = 3), or tumor resection (n = 1) were studied (Table 1 and Supplementary Information). All visible lesions were limited to the frontal lobe. Subjects were at least 1.5 years postevent (mean 11.3 ± 9.0 years; range 1.5–34 years) and were prescreened to exclude individuals with a history of other neurological or psychiatric conditions. In particular, stroke patients with a history of cardioembolic stroke were selected in order to minimize the contribution of known atherosclerotic cerebrovascular disease to the neuroimaging data. A neuropsychological battery was administered to all subjects (see Supplementary Information). One subject with a left frontal lesion had a right face and arm hemiparesis that required him to respond with the left rather than the right hand. Written informed consent was obtained from subjects in accordance with procedures approved by the Committee for Protection of Human Subjects at the University of California, Berkeley.

Table 1

Demographic information for each of the 18 subjects

Subject number Age Education (years) Lesion site Lesion size (cc) Time since onset (years) Etiology 
1 61 14 R lateral PFC 49 Ischemic stroke 
2 75 13 R lateral PFC 105 4.5 Ischemic stroke 
3 60 14 R basal ganglia Hemorrhagic stroke 
4 73 14 L basal ganglia 15 Hemorrhagic stroke 
5 68 14 L basal ganglia Hemorrhagic stroke 
6 56 B OFC (R > L) 137 34 Trauma 
7 63 13 L rostromedial PFC 65 Ischemic stroke 
8 60 16 B OFC 247 16 Tumor resection 
9 72 12 R inferolateral PFC 20 10 Ischemic stroke 
10 44 16 B OFC 18 1.5 Trauma 
11 67 16 R lateral OFC 12 33 Trauma 
12 63 15 L lateral PFC 92 3.5 Ischemic stroke 
13 58 18 L lateral PFC 62 Ischemic stroke 
14 65 11 L lateral PFC 116 13 Ischemic stroke 
15 75 16 L lateral PFC 147 10.5 Ischemic stroke 
16 51 20 L lateral PFC 150 12 Ischemic stroke 
17 47 18 L lateral PFC 122 7.5 Ischemic stroke 
18 52 20 L lateral PFC 237 10.5 Ischemic stroke 

Note: R, right; L, left; B, bilateral; OFC, orbitofrontal cortex.

Logic and Design

In order to investigate the discovery of abstract rules, we used a reinforcement-learning task that required the learning of 2 rule sets, one of which contained a higher order rule structure (Hierarchical rule set) and one that could only be learned as one-to-one mappings between stimuli and responses (Flat rule set; both rule sets shown in Fig. 1). Participants were not given an indication through an instruction or any other cue that a higher order structure existed in one of the rule sets. Moreover, trials for both rule sets were identical in terms of all stimulus presentation parameters, instructions, and response-reward contingencies.

Each rule set was learned over the course of 360 individual learning trials. Each trial commenced with the presentation of a stimulus display consisting of a nonsense object (i.e., without a real-world counterpart) appearing in 1 of 3 orientations (up [0°], left [−90°], or oblique [23°]) and bordered by a colored square. For each rule set, 2 colors, 3 object shapes, and 3 orientations were used, giving rise to 18 unique stimulus displays (i.e., 3 shapes × 3 orientations × 2 colors). Each of the 18 unique displays occurred 20 times for each rule set (Hierarchical and Flat). The specific colors and shapes differed across the 2 rule sets within subject and were counterbalanced for rule set across subjects.

The object and square appeared together for 5 s and were then replaced by a white fixation cross for a variable intertrial interval (mean of 1.5 s, range 0–8 s). While the stimulus display was present, the participant could respond with 1 of 3 buttons using the index, middle, or ring fingers of the right hand. (For one subject with significant right-hand weakness, the left hand was used.) Once a response was made or 5 s had passed without a response, the green fixation cross turned red and no further responding was allowed. A lack of response was scored as an incorrect trial. Subjects then received auditory feedback: a high tone (750 Hz) indicated a correct response and a buzzing tone (combination of 300 and 400 Hz pure tones) indicated an incorrect response. A running total of correct responses was displayed at the end of each run of 60 trials. The order of trials within a block was determined in pseudorandom fashion, as described in our previous study (Badre et al. 2010), and the order of rule set learning (i.e., whether Hierarchical or Flat was learned first) was counterbalanced across participants.

For both rule sets, participants were given the same instructions. No indication was given that a higher order relationship existed or that they should search for an abstract rule. Participants did not practice the task but they were allowed to fully familiarize themselves with all 18 stimuli they would encounter for a given rule set prior to conducting the learning trials for that rule set. As a result, the 2 rule sets differed only in the arrangement of mappings between stimulus displays and responses (Fig. 1B–E).

Behavioral Analysis

Learning curves were calculated using a state-space modeling procedure (Smith et al. 2004) that estimates the probability of a correct response on each trial as a function of a latent Gaussian state process (i.e., the state of knowledge of the subject) and an observable Bernoulli response process (i.e., the responses of the subject). In other words, the model uses the learner's trial-by-trial responses (either correct or incorrect) to estimate his/her knowledge about the task over time. In contrast with “sliding average” or other methods of computing learning curves, this approach allows one to define a confidence interval associated with the estimate of learning on each trial. Thus, this method produces a “learning trial,” or the trial at which the confidence interval no longer encompasses chance performance. Because this method estimates a single value for the variance of the Gaussian state process across learning, it does not incorporate details of the task or make assumptions about hierarchical learning (for further details, see Smith et al. 2004). Learning curves using this procedure were calculated for each subject for the entire rule set, as well as for each of the 18 individual rules based on the 20 presentations of a particular stimulus.

We focused our behavioral analysis on 2 measures of learning for both Hierarchical and Flat sessions: 1) the terminal accuracy (i.e., the probability of a correct response on the final trial), which is related to the degree of learning at the conclusion of each of the Hierarchical and Flat conditions and 2) the learning trial for each of the 18 stimulus–response mappings in each of the Hierarchical and Flat conditions—that is, that trial, if any, at which there was a 95% or greater probability that responding for a given mapping was different from chance performance. Individual object-response mappings were considered to be learned if a learning trial could be defined, or if the number of actual correct responses for an object deviated significantly from the expected number of correct responses for chance responding, based on Bernoulli assumptions (>13 correct responses of the 20 presentations of each object; P < 0.05, Bonferroni corrected for the number of objects). We also compared the number of learned objects for each colored square, in both Hierarchical and Flat sessions, in order to determine whether learning was specific to a subset of feature combinations. Terminal accuracy values were variance stabilized via an arcsine square root transform prior to statistical analyses. All parametric tests were performed under the assumptions of potentially unequal numbers and variances, with the Welch–Satterthwaite approximation used to estimate the degrees of freedom. As a consequence, the degrees of freedom varied from test to test, despite unchanging numbers of subjects.
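As a rough check on the criteria above, the following Python sketch computes (i) the smallest number of correct responses out of 20 that differs from chance (1/3) after Bonferroni correction across 18 objects, (ii) the arcsine square-root transform, and (iii) Welch's t statistic with Welch–Satterthwaite degrees of freedom. The function names are hypothetical, and a one-sided binomial tail is assumed because the text does not state the sidedness of the test.

```python
import math
from statistics import mean, variance


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_significant_correct(n=20, p=1 / 3, alpha=0.05, n_tests=18):
    """Smallest count whose one-sided tail probability is below alpha / n_tests."""
    threshold = alpha / n_tests  # Bonferroni-corrected criterion
    for k in range(n + 1):
        if binomial_upper_tail(k, n, p) < threshold:
            return k
    return None


def arcsine_sqrt(proportion):
    """Variance-stabilizing transform for a proportion in [0, 1]."""
    return math.asin(math.sqrt(proportion))


def welch_t(a, b):
    """Return (t, df) for two samples with possibly unequal sizes and variances."""
    va = variance(a) / len(a)  # squared standard error of each group mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    print(min_significant_correct())
```

With these assumptions, `min_significant_correct()` returns 14, which agrees with the ">13 correct responses" criterion. Because the Welch–Satterthwaite degrees of freedom depend on the sample variances, they are generally non-integer and change from test to test even when group sizes are fixed.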

Magnetic Resonance Imaging Acquisition Procedures

T2*-weighted echo planar images (EPIs) were collected on a whole body 3-T Siemens MAGNETOM Trio magnetic resonance imaging (MRI) scanner using a 12-channel head coil. Structural images were acquired using an axial MP-RAGE 3D T1-weighted sequence (time repetition [TR] = 2300 ms, time echo [TE] = 2.98 ms, flip angle = 9°, voxel size = 1 mm³) and a fluid attenuated inversion recovery image to assist with lesion visualization, as in our previous work (Nomura et al. 2010). Resting state images consisted of 28 slices acquired with a gradient echo planar imaging protocol (300 time points for each of 2 runs, TR = 2000 ms, TE = 30 ms, field of view = 225 mm, matrix size = 128 × 128, voxel size = 1.75 × 1.75 × 3.3 mm) for each of the 18 subjects. Prior to resting state scans, participants were instructed simply to remain awake with their eyes open. All scans were obtained at least 6 months after the index event for each subject.

MRI Preprocessing and Lesion Definition

In keeping with our previous work (Nomura et al. 2010), the software package AFNI (Analysis of Functional NeuroImages) was used for slice timing correction, image realignment, and removal of nonbrain structures from the EPI volumes prior to spatial smoothing with a 6-mm full-width at half-maximum Gaussian kernel. The high-resolution T1-weighted image was co-registered with the mean functional data and segmented using SPM5 (Wellcome Department of Cognitive Neurology, London) via a template derived from 152 normal subjects (MNI152; Montreal Neurological Institute, Montreal, Quebec, Canada). All analyses were then performed on the native-space functional images. The extra segmentation step was necessary for accurate registration of images demonstrating structural brain damage. To address the effect of subjects' lesions on regions of interest (ROIs) implicated in the learning and application of abstract rules, we calculated the percentage of voxels in each ROI that overlapped with the lesion mask (Fig. 2). Lesion masks were constructed in our previous study (Nomura et al. 2010), with all individual subject masks shown in normalized space in Supplementary Figure S1.
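The ROI-lesion overlap measure reduces to a voxel count. The sketch below assumes that each ROI and each lesion mask is available as a collection of integer voxel coordinates in a common space; the function name and the toy coordinates are hypothetical and serve only to illustrate the calculation.

```python
def percent_roi_lesioned(roi_voxels, lesion_voxels):
    """Percentage of ROI voxels that fall inside the lesion mask."""
    roi = set(roi_voxels)
    if not roi:
        raise ValueError("ROI contains no voxels")
    overlap = roi & set(lesion_voxels)
    return 100.0 * len(overlap) / len(roi)


if __name__ == "__main__":
    # Toy example: a 4 x 4 single-slice ROI, half of which is covered by the lesion.
    roi = {(x, y, 0) for x in range(4) for y in range(4)}
    lesion = {(x, y, 0) for x in range(2, 6) for y in range(4)}
    print(percent_roi_lesioned(roi, lesion))
```

Representing masks as coordinate sets keeps the example dependency-free; with image arrays, the same quantity is the number of voxels where both masks are nonzero divided by the number of nonzero ROI voxels.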

Figure 2.

(A) Locations of ROIs. (B) The cumulative lesion burden across all 18 subjects. The number of subjects with overlapping lesion locations is indicated by the color bar at bottom. (C) The percentage of voxels in each ROI affected by the single lesions in each of the 18 subjects. Note that the lesions in subjects 9–11 did not involve any of the prespecified ROIs. (Please see Supplementary Figure S1 for individual subject lesions.)

Functional Connectivity

Twelve ROIs were derived from our previous work (Badre et al. 2010), including 4 left lateral frontal cortical areas representing first- through fourth-order levels of policy abstraction—dorsal premotor cortex (PMd), pre-PMd cortex, the inferior frontal sulcus (IFS), and frontopolar cortex, respectively—as well as bilateral caudate and putamen ROIs. Activity within these basal ganglia ROIs was noted to share Granger causal influences with both PMd and pre-PMd in our previous study (Badre et al. 2010). Given the potential importance of cortico-striato-thalamic loops in cognitive processing (Alexander et al. 1986; Houk and Wise 1995; Graybiel 1998), we included these basal ganglia ROIs in our analyses. To derive right-sided frontal ROIs, we chose areas homologous to the left-sided regions by simply inverting the x-coordinate for each left-sided ROI. These 12 ROIs (Fig. 2A) were then reverse normalized to each subject's native space, utilizing the normalization parameters obtained from the SPM5 segmentation tool.

After preprocessing, each time series for the two 5-min resting-state runs was windowed with a 4-point split-cosine bell and concatenated with the other segment to produce a subject-specific 600 time-point series for every voxel in the brain. Time series within each ROI were then averaged across voxels to generate a single time series for each ROI. Coherency values were obtained by applying a fast Fourier transform (Matlab 6.5, http://www.mathworks.com) to the data for each pair of ROIs, implemented via Welch's periodogram averaging method using a 64-point discrete Fourier transform, Hanning window, and overlap of 32 points (Kayser et al. 2009). Coherence values for each ROI were then computed using the band-averaged coherence. To compute correlations between coherence results and other values, we first Fisher transformed the coherence values to generate an approximately normal distribution (Rosenberg et al. 1989) that permitted us to apply parametric statistical tests.
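The coherence calculation can be sketched in plain Python as follows. This is a minimal illustration of Welch-averaged coherence with a 64-point transform, Hann window, and 32-point overlap, not the Matlab implementation used in the study. The removal of each segment's mean, the frequency band limits, and the choice to apply the Fisher transform to the square root of the magnitude-squared coherence are all assumptions, since the text does not specify them; the split-cosine tapering and run concatenation are omitted.

```python
import cmath
import math
import random


def hann(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft(x):
    """Direct discrete Fourier transform, non-negative frequencies only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def welch_coherence(x, y, nfft=64, overlap=32):
    """Magnitude-squared coherence per frequency bin via Welch averaging."""
    w = hann(nfft)
    nbins = nfft // 2 + 1
    pxx, pyy, pxy = [0.0] * nbins, [0.0] * nbins, [0j] * nbins
    for start in range(0, len(x) - nfft + 1, nfft - overlap):
        xs, ys = x[start:start + nfft], y[start:start + nfft]
        mx, my = sum(xs) / nfft, sum(ys) / nfft  # assumption: remove segment mean
        fx = dft([(a - mx) * c for a, c in zip(xs, w)])
        fy = dft([(b - my) * c for b, c in zip(ys, w)])
        for k in range(nbins):
            pxx[k] += abs(fx[k]) ** 2
            pyy[k] += abs(fy[k]) ** 2
            pxy[k] += fx[k] * fy[k].conjugate()
    coh = []
    for k in range(nbins):
        den = pxx[k] * pyy[k]
        coh.append(abs(pxy[k]) ** 2 / den if den > 0 else 0.0)
    return coh


def band_averaged_coherence(coh, tr=2.0, nfft=64, band=(0.01, 0.1)):
    """Mean coherence over bins inside `band` (Hz); the band limits are assumed."""
    freqs = [k / (nfft * tr) for k in range(len(coh))]
    vals = [c for f, c in zip(freqs, coh) if band[0] <= f <= band[1]]
    return sum(vals) / len(vals)


def fisher_z(coherence):
    """arctanh of the coherency magnitude (assumed form of the Fisher transform)."""
    return math.atanh(min(math.sqrt(coherence), 1 - 1e-12))


if __name__ == "__main__":
    random.seed(0)
    x = [random.gauss(0, 1) for _ in range(600)]   # synthetic ROI time series
    y = [a + 0.5 * random.gauss(0, 1) for a in x]  # shares signal with x
    print(fisher_z(band_averaged_coherence(welch_coherence(x, y))))
```

The synthetic series in the demonstration are random numbers, not data from the study. In practice a vetted signal-processing library routine would normally replace the hand-written transform; the explicit loops are kept here only so that each step of the averaging is visible.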

Results

We tested 18 subjects with brain lesions, involving the frontal cortex and/or basal ganglia and potentially affecting 12 ROIs implicated in the learning and execution of hierarchical rules (Fig. 2, Supplementary Figures S1 and S2; see also Materials and Methods). In keeping with our previous data and with other results from both control and subject populations that the left lateral prefrontal cortex may be implicated in rule processing (Goel and Dolan 2004; Reverberi et al. 2009), subjects were ordered by lesion location, such that low subject numbers were associated with right-predominant lesions of pre-PMd cortex and high numbers with left-predominant lesions. Four subjects demonstrated lesions that significantly involved left pre-PMd (Fig. 2C, subjects 15–18). Both neuropsychological testing and demographic variables were assessed (see Supplementary Materials).

As demonstrated by their learning curve trajectories, none of the subjects reached perfect performance in either the Hierarchical or the Flat condition (Fig. 3). Notably, across the group of subjects, there were no significant differences between terminal accuracies or the learning trial in the Hierarchical and Flat conditions (Ps > 0.22), suggesting that subjects did not uncover a second-order rule in the Hierarchical condition (Badre et al. 2010). To investigate whether performance in the Hierarchical condition was differentially affected by lesion location, we evaluated the Hierarchical and Flat conditions in 4 subjects with complete or near-complete lesions of the second-order region, left pre-PMd. As evidenced by Fig. 4A, there was a strongly significant effect of lesion location on differential learning in the Hierarchical versus Flat cases (F1,14 = 22.7, P = 0.0003). Despite failing to learn the full second-order rule space, left pre-PMd subjects showed significantly better differential accuracy than subjects with other frontal lesions: Hierarchical − Flat difference = 0.10 versus 0.00, T7 = 5.35, P = 0.001. (Importantly, this result remained significant if group membership was weighted by the extent of left pre-PMd involvement by the lesion, thereby incorporating the minor influence of subjects 13–14: F1,14 = 17.8, P = 0.0009; weighted differences 0.08 versus 0.00, T6 = 4.5, P = 0.002.) The differences between terminal accuracies in the Hierarchical and Flat conditions for these subjects were driven primarily by differences in terminal accuracy for the Hierarchical (second-order) rule set (left pre-PMd group = 0.62; other group = 0.41; T7 = 5.3, P = 0.001; Fig. 4B). There were also concordant between-group differences for these values, though only at trend significance, for the Flat rule (left pre-PMd group = 0.30; other group = 0.42; T6 = −2.0, P = 0.09; Fig. 4B).
Thus, despite failing to learn the second-order rule structure, subjects with left pre-PMd lesions performed significantly better in the Hierarchical condition than subjects with lesions elsewhere, even closely adjacent ones.

Figure 3.

Learning curves for each of the (color coded) 18 subjects across the 360 trials for the Hierarchical (left) and Flat (right) rule sets. A probability correct of 0.33 represents chance performance. The probability correct after the last trial is defined as the terminal accuracy.

Figure 4.

(A) The difference between the terminal accuracies in the Hierarchical and Flat conditions for each of the 18 subjects. Those subjects with complete or near-complete lesions of left pre-PMd cortex (left pre-PMd; subjects #15–18) are highlighted by the gray shading. The box-whisker plot to the right summarizes these differences for the left pre-PMd and other-lesion groups. (B) Terminal accuracies for the subjects in the Hierarchical condition (top) and Flat condition (bottom). * indicates P ≤ 0.001; ∼ indicates 0.05 < P < 0.10.

To understand why subjects with left pre-PMd lesions performed better than other subjects in the Hierarchical relative to the Flat condition, we compared the number of individual stimulus–response mappings successfully learned in subjects with and without left pre-PMd lesions. Across both the Hierarchical and Flat conditions, the left pre-PMd and other-lesion groups learned similar numbers of stimulus–response mappings (11 vs. 9.1, T5 = 0.74, P = 0.5 [not significant, ns]). However, there were trends for left pre-PMd subjects to learn more Hierarchical rules (8.3 vs. 4.2, T5 = 2.4, P = 0.06) and fewer Flat rules (2.8 vs. 4.9, T10 = −2.0, P = 0.07) than subjects with other lesions. Importantly, the mappings learned by left pre-PMd subjects in the Hierarchical case were not divided equally across the “shape” and “orientation” rules that together comprised the second-order condition; rather, subjects learned one, but not both, of these mappings. The absolute difference between number of “shape” mappings and number of “orientation” mappings learned in the Hierarchical case for subjects with left pre-PMd lesions was both significantly greater than in the Flat case (5.25 vs. 1.25, T3 = 6.9, P = 0.006) and significantly greater than the performance of the other-lesion group in either rule condition (both Ps < 10⁻⁵; Fig. 5A). As this finding suggests, and as evidenced by the left pre-PMd subject with median performance (#15, Fig. 5B), these subjects learned significantly more than the other-lesion subjects, but on only a restricted portion of the rule space: that is, learning for 3 (1) subjects occurred on the “shape” (“orientation”) rule and not on the “orientation” (“shape”) rule.

Figure 5.

(A) The difference between the number of learned rules associated with one colored square versus the other (in the Hierarchical condition, representing the “shape” and “orientation” rules). The left pre-PMd group shows asymmetric learning across colored squares in the Hierarchical condition only (*P < 0.01; **P < 0.0001). (B) Number of correct responses for each of the 18 stimuli during the Hierarchical condition for subject #15, who showed median terminal accuracy within the left pre-PMd group. Learned rules are indicated by the black shading, while unlearned rules are indicated by white shading. The first 9 stimuli are part of the “shape” rule, while the second 9 are part of the “orientation” rule.
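
The asymmetry measure plotted in panel A can be sketched in a few lines of Python. This is only a minimal illustration, assuming that each subject's outcome is stored as 18 learned/unlearned flags in the stimulus order of panel B (first 9 "shape", second 9 "orientation"). The function name `rule_asymmetry`, its `n_shape` parameter, and the example subject are hypothetical and are not taken from the study's analysis code.

```python
def rule_asymmetry(learned, n_shape=9):
    """Absolute difference between learned 'shape' and 'orientation' mappings.

    `learned` holds one boolean per stimulus, ordered so that the first
    `n_shape` entries belong to the "shape" rule and the remaining entries
    belong to the "orientation" rule.
    """
    shape_count = sum(bool(flag) for flag in learned[:n_shape])
    orientation_count = sum(bool(flag) for flag in learned[n_shape:])
    return abs(shape_count - orientation_count)


# Hypothetical subject (not real data): 8 "shape" mappings and
# 2 "orientation" mappings learned out of 9 each.
example = [True] * 8 + [False] + [True] * 2 + [False] * 7
asymmetry = rule_asymmetry(example)
```

A subject who learned both rules equally well, or equally poorly, would score 0 on this measure, whereas a subject who learned only one rule would score close to 9.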

To understand the neural correlates of these performance differences, we evaluated resting-state functional connectivity between left pre-PMd and other nodes implicated in policy abstraction (Badre and D'Esposito 2007; Badre et al. 2010). We hypothesized that search restricted to a portion of the rule space (that is, search in which second-order rules were not evaluated) should be correlated with disconnection of the second-order region from the hierarchy, that is, with decreased baseline connectivity between left pre-PMd and its superordinate (left IFS) and subordinate (left PMd) nodes. Consistent with this hypothesis, a significant inverse relationship was seen between terminal accuracy in the Hierarchical condition and both left pre-PMd ↔ left IFS connectivity (R = −0.69, P = 0.001; Fig. 6A, left) and left pre-PMd ↔ left PMd connectivity (R = −0.52, P = 0.03; Fig. 6A, right). In other words, decreased connectivity between the second-order region (pre-PMd) and both the first-order (PMd) and third-order (IFS) areas was correlated with improved performance on a subset of the rule space in the Hierarchical condition. Notably, neither relationship to connectivity held for the Flat condition (R = 0.00, ns and R = −0.25, ns; Fig. 6B), in which the task structure does not reward search that neglects one feature.

Figure 6.

(A) The correlation of terminal accuracy in the Hierarchical condition with resting-state coherence between left pre-PMd and left IFS (left panel) and between left pre-PMd and left premotor cortex (PMd; right panel). Subjects in the left pre-PMd group are indicated by dark gray circles; subjects in the other-lesion group are indicated by light gray circles. Both correlations remain significant if left pre-PMd subjects are excluded. (B) The correlation of terminal accuracy in the Flat condition with resting-state coherence between these same ROIs. Neither correlation value is significant.

Importantly, the above correlation results were not driven solely by the 4 subjects with left pre-PMd lesions. A concern might be that these correlations arose because subjects known to perform better in the Hierarchical condition had lesions in the ROI, and that the significant correlations therefore resulted from the performance of these subjects alone. However, when subjects #15–18 were removed from the calculation (dark gray circles, Fig. 6), both correlations remained strongly significant (R = −0.71, P = 0.005 and R = −0.66, P = 0.01, respectively). Moreover, when partial correlations were taken with respect to demographic variables (age, education, and lesion size) for all subjects, these correlations remained significant (R = −0.73, P = 0.002 and R = −0.54, P = 0.04, respectively). Similar correlations were not seen with respect to performance in the Flat condition (all Ps > 0.32; Fig. 6B). These correlations between connectivity and behavior also captured the performance of subjects #5 and #7 (left caudate lesions; Fig. 2C), who performed differentially well in the Hierarchical rule case relative to the Flat case (Fig. 4A). Thus, the inverse correlation between left pre-PMd connectivity and Hierarchical performance held across all subjects, not just those with left pre-PMd lesions.
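
For readers unfamiliar with partial correlation, the following Python sketch shows one standard way to compute it: regress both variables on the covariates (plus an intercept) and correlate the residuals. This text does not specify the exact procedure used in the analysis, so the sketch should be read as an illustration of the general technique rather than as the method applied here. The function names are hypothetical, and all values are synthetic placeholders rather than study data.

```python
import math
import random


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def _residualize(v, basis):
    """Subtract from v its projection onto each mutually orthogonal basis vector."""
    out = [float(value) for value in v]
    for b in basis:
        coef = _dot(out, b) / _dot(b, b)
        out = [o - coef * bi for o, bi in zip(out, b)]
    return out


def partial_corr(x, y, covariates):
    """Pearson correlation of x and y after regressing out an intercept and covariates."""
    basis = [[1.0] * len(x)]                 # intercept column
    for column in covariates:                # Gram-Schmidt orthogonalization
        resid = _residualize(column, basis)
        if _dot(resid, resid) > 1e-12:       # skip linearly dependent covariates
            basis.append(resid)
    rx = _residualize(x, basis)
    ry = _residualize(y, basis)
    return _dot(rx, ry) / math.sqrt(_dot(rx, rx) * _dot(ry, ry))


# Synthetic placeholder values only; these are NOT the study's data.
rng = random.Random(0)
n_subjects = 18
age = [rng.gauss(60, 10) for _ in range(n_subjects)]
education = [rng.gauss(14, 2) for _ in range(n_subjects)]
lesion_size = [rng.gauss(30, 15) for _ in range(n_subjects)]
connectivity = [rng.gauss(0.3, 0.1) for _ in range(n_subjects)]
accuracy = [0.9 - c + rng.gauss(0, 0.05) for c in connectivity]

r_raw = partial_corr(connectivity, accuracy, [])   # no covariates: ordinary Pearson r
r_partial = partial_corr(connectivity, accuracy, [age, education, lesion_size])
```

With an empty covariate list the function reduces to the ordinary Pearson correlation, which makes it straightforward to compare the raw and covariate-adjusted values side by side.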

Discussion

Our capacity for generalizing previously learned behaviors to changed circumstances, a hallmark of intelligent behavior, is critical to our day-to-day ability to navigate the world. Its importance is demonstrated by behavioral changes in subjects with lesions of the lateral prefrontal cortex, following which cognition is often described as more concrete, perseverative, and/or stimulus bound (Devinsky and D'Esposito 2004). Our recent work investigating the neural basis for such adaptive behavior in healthy individuals demonstrated that the ability to learn an abstract rule structure correlates with activity in hierarchically organized lateral frontal regions: specifically, with activity in the left dorsal pre-premotor cortex (left pre-PMd) for the acquisition of a second-order rule structure (Badre et al. 2010). Here, we extend these findings to subjects with frontal lesions and show that, while no subject learned the full rule structure when a more abstract rule was available, 4 subjects with lesions of left pre-PMd performed better in the Hierarchical condition than those with frontal lesions elsewhere. Consistent with the hierarchy hypothesis, learning for left pre-PMd subjects was restricted to a portion of the rule space in the Hierarchical condition. Moreover, across all subjects with lesions, the degree to which left pre-PMd was disconnected from the hierarchy (that is, from the brain regions functionally above [IFS] and below [PMd] it) correlated with improved performance in the Hierarchical task only.

Other possible explanations do not account as well for our data. These results, for example, are not simply a consequence of lesions that affect the left hemisphere, irrespective of hierarchy. Subjects #12–14 have substantial left-sided lesions (Supplementary Figure S1) but did not show the behavior of subjects #15–18, whose left-sided lesions involved most or all of left pre-PMd. Another question concerns the role of lesion extent, as the lesions involving left pre-PMd also involved the third-order region in the left inferior frontal sulcus (IFS; Fig. 2). We cannot exclude the possibility that the lesion of IFS is also critical to this behavior, but note that functional connectivity between regions outside of IFS (i.e., between pre-PMd and PMd) was also negatively correlated with performance (discussed further below). While this change could arise secondary to the change in coherence between IFS and pre-PMd, a more parsimonious explanation is that both coherence values involving pre-PMd (i.e., PMd–pre-PMd and pre-PMd–IFS) correlate with behavior because of the disconnection of one area: pre-PMd.

By what cognitive mechanism, then, do lesions to left pre-PMd but not other frontal regions lead to suboptimal, but relatively advantageous, performance? An attractive idea is that higher order hierarchical regions have an important role in resolving competition in lower order regions (Badre and D'Esposito 2007, 2009; Badre et al. 2009), allowing different lower order stimulus–response relationships to be active at different times (Bunge 2004). If a higher order region is no longer capable of resolving this competition, the ability to flexibly employ different lower order rules would be lost. In a learning experiment, lesions of left pre-PMd could thereby limit the rule search space to a subset of possible rules driven by function in the (intact) first-order region (PMd).

However, this reduction in the search space of possible stimulus–response mappings was clearly helpful in patients with pre-PMd lesions, permitting suboptimal but above-chance performance on the task. In theory, the reduction in search space is potentially quite large: given the 18 unique stimuli and 3 responses, the number of possible combinations of S–R mappings is 3^18 = 387,420,489. While a number of these mappings are unlikely (for example, ones in which all stimuli map to the same response or to unbalanced numbers of the different responses), patients who effectively neglected one of the features (i.e., who selected combinations of color and shape or combinations of color and orientation, rather than combinations of all 3 features) would reduce the number of effective stimuli to 6, rather than 18, and decrease the search space to 3^6 = 729 combinations. Given further assumptions, for example a relatively equal distribution of button-press responses (which all subjects expressed), a much smaller space could be searched; and the strategy, while not optimal, would be relatively successful in the Hierarchical task (Frank and Badre 2011) given that one-half of the S–R mappings in this rule set effectively “ignore” one feature. Consistent with this explanation, such a strategy would be quite ineffective in the Flat condition, where all 3 features are important.
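
The counting argument above can be checked directly. The short Python sketch below reproduces the two search-space sizes and, under an assumed toy encoding of the Hierarchical rule set (2 colors × 3 shapes × 3 orientations, with one color cueing a shape-based response and the other an orientation-based response), computes the best accuracy available to a learner who neglects orientation. The encoding, the function names, and the choice of orientation as the neglected feature are illustrative assumptions, not the task code used in the study.

```python
from itertools import product

N_RESPONSES = 3
COLORS, SHAPES, ORIENTATIONS = range(2), range(3), range(3)

# Unrestricted search: each of the 18 stimuli may map to any of 3 responses.
n_stimuli = len(COLORS) * len(SHAPES) * len(ORIENTATIONS)
full_space = N_RESPONSES ** n_stimuli

# Neglecting one feature (here, orientation) leaves 2 x 3 = 6 effective stimuli.
n_effective = len(COLORS) * len(SHAPES)
restricted_space = N_RESPONSES ** n_effective


def hierarchical_rule(color, shape, orientation):
    """Assumed toy rule set: color 0 -> respond by shape, color 1 -> by orientation."""
    return shape if color == 0 else orientation


def best_accuracy_ignoring_orientation(rule):
    """Best achievable accuracy for a policy that sees only (color, shape)."""
    correct = 0
    for color, shape in product(COLORS, SHAPES):
        # Tally which response is correct for each value of the neglected
        # feature, then commit to the most frequently correct response.
        counts = [0] * N_RESPONSES
        for orientation in ORIENTATIONS:
            counts[rule(color, shape, orientation)] += 1
        correct += max(counts)
    return correct / n_stimuli


restricted_accuracy = best_accuracy_ignoring_orientation(hierarchical_rule)
```

Under these assumptions, the feature-neglecting policy is correct on 12 of the 18 stimuli (all 9 under the shape-cueing color and 3 of the 9 under the other color), which is above the 1-in-3 chance level but well short of perfect performance.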

In support of these ideas, the above behavioral changes were inversely correlated with a neurophysiological measure of left pre-PMd connectivity across all subjects. Specifically, coherence between left pre-PMd and its superordinate (IFS) and subordinate (PMd) cortical areas within the policy abstraction hierarchy was lower in subjects who learned more accurately. Thus, even in subjects without lesions of left pre-PMd, increasing disconnection of this second-order region, accompanied by a corresponding restriction in the search space of second-order rules, was associated with improved performance only in the Hierarchical condition. This finding highlights the specificity of these results for the function of left pre-PMd, whether the functional deficit is due to an overt lesion or not, and supports the notion of a hierarchical relationship between these regions. As noted in the Results, for example, 2 subjects with lesions of the left caudate (subjects #5 and #7) demonstrated patterns of performance similar to those of the left pre-PMd subjects and had similar levels of disconnection of left pre-PMd. Anatomically, this region of the caudate was shown to be functionally connected to the left pre-PMd in our previous study (Badre et al. 2010), suggesting that the disconnection in these subjects may be related to dysfunction in the cortico-striato-thalamic loops linking them (Alexander et al. 1986). These subjects thus reinforce the idea that dysfunction in brain regions is the ultimate determinant of behavior, and that this dysfunction can be mediated via injury to remote anatomical sites, that is, via diaschisis (Monakow 1914; Finger et al. 2004; Nomura et al. 2010). Moreover, this finding held only for the relevant task structure, that is, the Hierarchical case.

In this context, a final prediction of our model is that functional connectivity between left pre-PMd and super-/subordinate lateral cortical regions should correlate inversely with terminal accuracy only when performance is suboptimal. In other words, the tendency to explore a reduced portion of the rule space should correlate inversely with engagement of left pre-PMd. However, this same argument suggests that learning the full rule space should be positively, not negatively, correlated with functional connectivity of left pre-PMd, as seen by others in the execution of higher order rules (Koechlin et al. 2003; Kouneiher et al. 2009). To address this question, we reanalyzed task-related functional MRI data from our previous study (Badre et al. 2010), in which many healthy subjects successfully learned the full rule structure. As predicted, in this case, greater connection of left pre-PMd within the hierarchy during task performance showed a positive, not negative, correlation with terminal accuracy, at trend significance at the beginning of learning (R = 0.39, P = 0.087) and reaching significance by the end of learning (R = 0.49, P = 0.03; Fig. 7A). As expected, this relationship did not hold for the Flat condition (Fig. 7B).

Figure 7.

(A) The correlation of terminal accuracy in the Hierarchical condition with task-state coherence between left pre-PMd and left IFS during both early learning (i.e., during the first third of trials—left panel) and later learning (i.e., during the last third of trials—right panel) for healthy subjects from our previous study (Badre et al. 2010). (B) The correlation of terminal accuracy with functional connectivity between these same regions in the Flat condition. Neither correlation value is significant.

One final question concerns the relatively poor performance of the subjects with lesions outside left pre-PMd. Given the variable nature of the lesions in other subjects, there are likely to be multiple explanations. Consistent with theories in which there are multiple behavioral “controllers” subject to reinforcement learning, for example (Doya et al. 2002; Holroyd and Coles 2002; Frank and Badre 2011), it has been suggested that motivational processes in hierarchically organized regions of the medial prefrontal cortex “energize” processing in corresponding areas of the lateral prefrontal cortex (Kouneiher et al. 2009). Additionally, striatal regions are thought to acquire first-order stimulus–response associations over longer time periods (Houk and Wise 1995; O'Reilly et al. 2007; Grahn et al. 2009), and reward-related striatal and ventral prefrontal regions may mediate learning from unexpected outcomes (Schoenbaum et al. 2009; Takahashi et al. 2009). Subjects with lesions affecting these areas may therefore be doubly disadvantaged in their efforts to search the S–R space, in that they confront the full search space with impaired search mechanisms. More broadly, these findings reinforce the idea that lesions in other sites can disrupt component processes supported by the networks in which the left pre-PMd is embedded. While left pre-PMd may be preferentially engaged in second-order learning, it is “specialized” for this function only in the context of a network of active brain regions. Further work to test lesion subgroups will clearly be important to better define how these other lesions impact both learning and network behavior.

In summary, these results are consistent with the existence of a rostrocaudal hierarchical organization within lateral frontal cortex that supports learning at various levels of abstraction. In keeping with our previous work (Badre et al. 2009; Badre et al. 2010), lesions of the left pre-PMd, implicated in the discovery of second-order rule structures, disrupt learning of second-order rules. Importantly, the nature of this disruption depends critically on task structure. Because the Hierarchical condition includes 2 simpler rules that each ignore one feature (shape and orientation, respectively), this disruption improves performance in these subjects relative to those with lesions elsewhere in the frontal cortex and basal ganglia, although overall learning remains impaired. In addition to operationalizing the concept that subjects with such lesions can be more concrete or stimulus bound (Devinsky and D'Esposito 2004), these findings suggest that under some conditions, a restricted rule space can be relatively advantageous. More generally, they emphasize that the impact of lesions on behavior can be modified by task structure and context, a concept ultimately important in using this knowledge to advance rehabilitation efforts for these and similar subjects.

Supplementary Material

Supplementary material can be found at: http://www.cercor.oxfordjournals.org/

Funding

The State of California to A.S.K.; the National Institutes of Health (grants MH63901 and NS40813 to M.D.).

We thank J. Hoffman for assistance with data collection, D. Badre and M. Frank for helpful discussions of preliminary results, E. Nomura and C. Gratton for access to resting state data, R. Knight, J. Black, and D. Scabini for assistance with subject recruitment, and the subjects for their participation.

Conflict of Interest: None declared.

References

Alexander GE, DeLong MR, Strick PL. 1986. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu Rev Neurosci. 9:357-381.

Badre D. 2008. Cognitive control, hierarchy, and the rostro-caudal organization of the frontal lobes. Trends Cogn Sci. 12:193-200.

Badre D, D'Esposito M. 2007. Functional magnetic resonance imaging evidence for a hierarchical organization of the prefrontal cortex. J Cogn Neurosci. 19:2082-2099.

Badre D, D'Esposito M. 2009. Is the rostro-caudal axis of the frontal lobe hierarchical? Nat Rev Neurosci. 10:659-669.

Badre D, Hoffman J, Cooney JW, D'Esposito M. 2009. Hierarchical cognitive control deficits following damage to the human frontal lobe. Nat Neurosci. 12:515-522.

Badre D, Kayser AS, D'Esposito M. 2010. Frontal cortex and the discovery of abstract action rules. Neuron. 66:315-326.

Bor D, Owen AM. 2007. A common prefrontal-parietal network for mnemonic and mathematical recoding strategies within working memory. Cereb Cortex. 17:778-786.

Buckner RL. 2003. Functional-anatomic correlates of control processes in memory. J Neurosci. 23:3999-4004.

Bunge SA. 2004. How we use rules to select actions: a review of evidence from cognitive neuroscience. Cogn Affect Behav Neurosci. 4:564-579.

Chase WG, Simon HA. 1973. Perception in chess. Cognitive Psychology. 4:55-81.

Christoff K, Gabrieli JDE. 2000. The frontopolar cortex and human cognition: evidence for a rostrocaudal hierarchical organization within the human prefrontal cortex. Psychobiology. 28:168-186.

Christoff K, Keramatian K. 2007. Abstraction of mental representations: theoretical considerations and neuroscientific evidence. In: Bunge SA, Wallis JD, editors. Perspectives on rule-guided behavior. New York: Oxford University Press.

Christoff K, Prabhakaran V, Dorfman J, Zhao Z, Kroger JK, Holyoak KJ, Gabrieli JD. 2001. Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage. 14:1136-1149.

Courtney SM. 2004. Attention and cognitive control as emergent properties of information representation in working memory. Cogn Affect Behav Neurosci. 4:501-516.

Devinsky O, D'Esposito M. 2004. Neurology of cognitive and behavioral disorders. New York: Oxford University Press.

Doya K, Samejima K, Katagiri K, Kawato M. 2002. Multiple model-based reinforcement learning. Neural Comput. 14:1347-1369.

Finger S, Koehler PJ, Jagella C. 2004. The Monakow concept of diaschisis: origins and perspectives. Arch Neurol. 61:283-288.

Frank MJ, Badre D. 2011. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. Cereb Cortex. PMID: 21693490.

Gick ML, Holyoak KJ. 1980. Analogical problem solving. Cognitive Psychology. 12:306-355.

Gick ML, Holyoak KJ. 1983. Schema induction and analogical transfer. Cognitive Psychology. 15:1-38.

Goel V, Dolan RJ. 2004. Differential involvement of left prefrontal cortex in inductive and deductive reasoning. Cognition. 93:B109-B121.

Grahn JA, Parkinson JA, Owen AM. 2009. The role of the basal ganglia in learning and memory: neuropsychological studies. Behav Brain Res. 199:53-60.

Graybiel AM. 1998. The basal ganglia and chunking of action repertoires. Neurobiol Learn Mem. 70:119-136.

Holroyd CB, Coles MG. 2002. The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol Rev. 109:679-709.

Houk JC, Wise SP. 1995. Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: their role in planning and controlling action. Cereb Cortex. 5:95-110.

Kayser AS, Sun FT, D'Esposito M. 2009. A comparison of Granger causality and coherency in fMRI-based analysis of the motor system. Hum Brain Mapp. 30:3475-3494.

Koechlin E, Jubault T. 2006. Broca's area and the hierarchical organization of human behavior. Neuron. 50:963-974.

Koechlin E, Ody C, Kouneiher F. 2003. The architecture of cognitive control in the human prefrontal cortex. Science. 302:1181-1185.

Koechlin E, Summerfield C. 2007. An information theoretical approach to prefrontal executive function. Trends Cogn Sci. 11:229-235.

Kouneiher F, Charron S, Koechlin E. 2009. Motivation and cognitive control in the human prefrontal cortex. Nat Neurosci. 12:939-945.

Monakow Cv. 1914. Die lokalisation im grosshirn und der abbau der funktion durch kortikale herde. Wiesbaden (Germany): J. F. Bergmann.

Newell A. 1990. Unified theories of cognition. Cambridge (MA): Harvard University Press.

Nomura EM, Gratton C, Visser RM, Kayser A, Perez F, D'Esposito M. 2010. Double dissociation of two cognitive control networks in patients with focal brain lesions. Proc Natl Acad Sci U S A. 107:12017-12022.

O'Reilly RC, Frank MJ, Hazy TE, Watz B. 2007. PVLV: the primary value and learned value Pavlovian learning algorithm. Behav Neurosci. 121:31-49.

Petrides M. 2006. The rostro-caudal axis of cognitive control processing within lateral frontal cortex. In: Dehaene S, Duhamel J-R, Hauser MD, Rizzolatti G, editors. From monkey brain to human brain: a Fyssen foundation symposium. Cambridge (MA): MIT Press. p. 293-314.

Picard N, Strick PL. 2001. Imaging the premotor areas. Curr Opin Neurobiol. 11:663-672.

Reverberi C, Shallice T, D'Agostini S, Skrap M, Bonatti LL. 2009. Cortical bases of elementary deductive reasoning: inference, memory, and metadeduction. Neuropsychologia. 47:1107-1116.

Robin N, Holyoak KJ. 1995. Relational complexity and the functions of prefrontal cortex. In: Gazzaniga MS, editor. The cognitive neurosciences. Cambridge (MA): MIT Press. p. 987-997.

Rosenberg JR, Amjad AM, Breeze P, Brillinger DR, Halliday DM. 1989. The Fourier approach to the identification of functional coupling between neuronal spike trains. Prog Biophys Mol Biol. 53:1-31.

Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. 2009. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci. 10:885-892.

Smith AC, Frank LM, Wirth S, Yanike M, Hu D, Kubota Y, Graybiel AM, Suzuki WA, Brown EN. 2004. Dynamic analysis of learning in behavioral experiments. J Neurosci. 24:447-461.

Takahashi YK, Roesch MR, Stalnaker TA, Haney RZ, Calu DJ, Taylor AR, Burke KA, Schoenbaum G. 2009. The orbitofrontal cortex and ventral tegmental area are necessary for learning from unexpected outcomes. Neuron. 62:269-280.