For behaviour to be purposeful, it is important to monitor the preceding behavioural context, particularly for factors regarding stimulus, response and outcome. The dorsolateral prefrontal cortex (DLPFC) appears to play a major role in such a context-dependent, flexible behavioural control system, and this area is likely to have a neuronal mechanism for such retrospective coding, which associates response-outcome with the information and/or neural systems that guided the response. To address this hypothesis, we recorded neuronal activity from the DLPFC of monkeys performing memory- and sensory-guided saccade tasks, each of which had two conditions with reward contingencies. We found that post-response activity of a subset of DLPFC neurons was modulated by three factors relating to earlier events: the direction of the immediately preceding response, its outcome (reward or non-reward) and the information type (memory or sensory) that guided the response. Such neuronal coding should play a role in associating response-outcome with information and/or neural systems used to guide behaviour — that is, ‘retrospective monitoring’ of behavioural context and/or neural systems used for guiding behaviour — thereby contributing to context-dependent, flexible control of behaviours.
Our ability to modify our behaviour is largely due to our ability to select appropriate responses according to a given situation or context. In primates, the dorsolateral prefrontal cortex (DLPFC) has long been thought to play a critical role in this ability (Duncan, 2001; Miller and Cohen, 2001; Tanji and Hoshi, 2001), and the cellular basis of this cognitive ability has been studied in non-human primates. For example, a subset of DLPFC neurons shows differential responses to an identical visual stimulus depending on the aspects that must be attended to (Sakagami and Niki, 1994; Rainer et al., 1998; Sakagami and Tsutsui, 1999; White and Wise, 1999; Hoshi et al., 2000) or on the motor response associated with that stimulus (Asaad et al., 1998; Wise and Murray, 2000). More recently, Miller and colleagues have shown that some DLPFC neurons encode on-going tasks (Asaad et al., 2000) or abstract rules (Wallis et al., 2001; Wallis and Miller, 2003a) rather than the stimulus identity itself. These findings indicate that a fraction of the neurons in the DLPFC can modulate their activity according to on-going conditions or rules to guide goal-directed behaviour appropriate to the given context.
In addition to the coding of on-going situations, it is important in flexible behavioural control to associate prior events or context, such as one's own prior action or response, its outcome and the stimulus or information that guided the action (Herrnstein, 1961; MacArthur and Pianka, 1966; Platt, 2002). The DLPFC is likely to be involved in such ‘retrospective monitoring and coding’ of prior events. Indeed, we recently reported that a subset of neurons in the DLPFC represent an association between an immediately preceding motor response and its outcome (reward or non-reward) (Tsujimoto and Sawaguchi, 2004a). Furthermore, Barraclough et al. (2004) reported that the activity of some DLPFC neurons, during a given trial in a free-choice task, is modulated according to the location of the monkey's choice and its outcome (reward or non-reward) in the previous trial. In addition, several lines of research have reported that this area is critical for performance of the Wisconsin Card Sorting Test (Milner, 1963; Barcelo and Knight, 2002), which requires the subjects to monitor and integrate their own prior choice and its outcome to select the next appropriate response. Thus, the DLPFC is likely to have a neuronal mechanism responsible for associating a preceding response-outcome with the context that guided the response. However, little is known about the neuronal basis of retrospective association between three critical factors, namely stimulus, response and outcome, although this is an important step in developing our understanding of the neuronal systems of the DLPFC in cognitive control of behaviour.
In the present study, we hypothesized that a subset of neurons in the DLPFC code response-outcome based on the nature of information that has been used to guide that response. To test this hypothesis, we adopted two saccade tasks — a memory-guided saccade (MGS-R) task and a visually guided saccade (VGS-R) task — each of which had two conditions with regard to reward contingency. The MGS-R task was a variant of the conventional oculomotor delayed-response task, which requires the subject to make a memory-guided saccade to a memorized location (Funahashi et al., 1989; Funahashi and Kubota, 1994). In the current version, the correct motor response was rewarded in half of the trials only, and the subjects could not expect the outcome (i.e. immediate reward or delayed reward) (Tsujimoto and Sawaguchi, 2004a). The VGS-R task had the same temporal sequence as the MGS-R task, but the target remained on during the delay and response periods; hence, the response was guided by sensory, rather than mnemonic, information. We report here that post-response activity of a subset of DLPFC neurons was modulated by three factors: the direction of the immediately preceding saccade, its outcome (immediate reward or delayed reward) and the information (sensory or memory) used to guide the response.
Materials and Methods
Subjects and Task Procedures
Two male macaque monkeys (Macaca fuscata, ∼6.5 and ∼5.0 kg, named SN and SZ, respectively) were used. Throughout this study, the subjects were treated in accordance with the ‘Guide for Care and Use of Laboratory Animals’ of the National Institutes of Health and the present experimental protocols were approved by the Animal Care and Use Committee of Hokkaido University School of Medicine. In the first week, the monkeys were habituated to a monkey chair and the experimental booth. Preliminary surgery was then performed under deep pentobarbital sodium anaesthesia (∼25 mg/kg i.v.) and aseptic conditions. The skull was partly exposed and two head-holding devices [stainless steel pipes, 8 mm inside diameter (ID)] were implanted on the anterior and posterior portions of the skull with dental acrylic. For grounding, small stainless-steel bolts (3 mm diameter) were anchored to the skull and fixed with dental acrylic. To prevent infection, antibiotics were injected intramuscularly on the day of surgery and daily for one week thereafter.
After recovery from the preliminary surgery, the subjects were trained to perform a modified memory-guided saccade (MGS-R) task with two different reward conditions: immediate reward (IM-Rw) and delayed reward (DL-Rw) (Fig. 1A). In both conditions, a trial commenced when the monkey fixated on a central spot (a white square, 0.5° × 0.5°) on the CRT monitor. After 1.5 s, a cue (a white square, 0.5° × 0.5°) appeared at one of the six symmetric peripheral locations (eccentricity 15°, Fig. 1B) for 0.5 s. After a delay period of 3 s, the fixation spot turned off, which instructed the monkey to make a memory-guided saccade to the cued location. When the eye movement fell inside a target window of 5° from the cue position, the cue reappeared as a target of second fixation; in the IM-Rw condition only, a drop of water (∼0.1 ml) was delivered into the mouth of the monkey 500 ms later. In the DL-Rw condition, the click sound of the solenoid valve, which was used for reward delivery, was presented without delivery of the water. After the monkey fixated on the peripheral target for 2 s (i.e. 2.5 s from the end of the saccade), the same amount of reward (∼0.1 ml) was delivered in both conditions. The target remained for another 1 s, and the monkey fixated on it, although this third fixation was not rewarded in either condition. Throughout the trial, the eye position was restricted to within 5° of the central fixation point or peripheral target and if the monkey broke fixation, the trial was aborted. The two reward conditions were randomly intermixed so that the monkey could not expect whether his saccade would be rewarded immediately (i.e. 500 ms) after the saccade. The monkeys were first trained on a conventional memory-guided saccade task where they were rewarded immediately after every correct saccade. After that, the two conditions were introduced simultaneously and then the duration of the F2-period was gradually extended to 2.5 s.
After sufficient training on the MGS-R task, we introduced another task, a modified version of the delayed visually guided saccade task (VGS-R task; Fig. 1A). The VGS-R task had the same temporal sequence and reward conditions (IM-Rw and DL-Rw) as the MGS-R task, but the cue remained in position after its initial appearance, throughout the delay and response periods; hence, the subjects' saccade was guided by a sensory (visual) stimulus.
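The trial structure described above can be summarized as a simple event timeline. The following is a minimal sketch, not the authors' task-control software: the durations are taken from the task description, whereas the names `Timeline` and `build_timeline` and the parameter `saccade_latency_ms` (time from fixation-spot offset to the end of the saccade, not reported in the text) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Durations from the task description, in milliseconds.
FIXATION_MS = 1500       # central fixation before cue onset
CUE_MS = 500             # cue presentation (MGS-R); the cue stays on in VGS-R
DELAY_MS = 3000          # delay before the fixation spot turns off
FIRST_EVENT_MS = 500     # water (IM-Rw) or valve click only (DL-Rw), after saccade end
SECOND_REWARD_MS = 2500  # water in both conditions, after saccade end
FINAL_HOLD_MS = 1000     # unrewarded third fixation on the target


@dataclass
class Timeline:
    cue_on: int
    cue_off: Optional[int]      # None when the cue never disappears (VGS-R)
    go_signal: int              # fixation spot turns off
    saccade_end: int
    first_event: int            # start of the post-response period of interest
    first_event_is_water: bool  # True in IM-Rw, False (click only) in DL-Rw
    second_reward: int
    target_off: int


def build_timeline(task: str, condition: str,
                   saccade_latency_ms: int = 200) -> Timeline:
    """Return event times (ms from trial start) for one correct trial.

    saccade_latency_ms is an assumed value, not taken from the paper.
    """
    if task not in ("MGS-R", "VGS-R"):
        raise ValueError("task must be 'MGS-R' or 'VGS-R'")
    if condition not in ("IM-Rw", "DL-Rw"):
        raise ValueError("condition must be 'IM-Rw' or 'DL-Rw'")
    cue_on = FIXATION_MS
    go_signal = cue_on + CUE_MS + DELAY_MS
    saccade_end = go_signal + saccade_latency_ms
    second_reward = saccade_end + SECOND_REWARD_MS
    return Timeline(
        cue_on=cue_on,
        cue_off=cue_on + CUE_MS if task == "MGS-R" else None,
        go_signal=go_signal,
        saccade_end=saccade_end,
        first_event=saccade_end + FIRST_EVENT_MS,
        first_event_is_water=(condition == "IM-Rw"),
        second_reward=second_reward,
        target_off=second_reward + FINAL_HOLD_MS,
    )
```

In this sketch the two tasks differ only in whether the cue is extinguished before the delay, and the two reward conditions differ only in what happens 500 ms after the saccade, which mirrors the design logic of the tasks.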
The two tasks were interleaved block-wise, with each block lasting 30–60 trials, so that the subjects were instructed implicitly about the on-going task and the information to be used. To confirm the reproducibility of the neuron's behaviour, each task block was repeated at least twice when isolating the activity of single neurons. The monkeys were sufficiently overtrained in the final version of the tasks so that the order of training would be unrelated to differences in neuronal activity between the two conditions and the two tasks. At the end of the training session and throughout the recording sessions, the performance of both monkeys was almost perfect (>95% correct responses).
The tasks and the recordings were controlled by a system consisting of an infrared eye-camera system (R-21C-A, RMS, Hirosaki, Japan), two personal computers (PC9801 FA and BA39, NEC, Tokyo) and other associated peripheral equipment. The eye-camera system was connected to the personal computers via A/D converters and was used for monitoring and sampling eye positions. The two personal computers were networked using RS232C and parallel I/O. One of the computers controlled the tasks, while the other monitored and collected the data for neuronal activities, eye positions and task events.
After training was completed, surgery for recording was performed. Under pentobarbital sodium anaesthesia (∼25 mg/kg, i.v.) and aseptic conditions, an oblong opening was made in the skull over the frontal cortex and the dura was exposed. A stainless steel chamber (20 × 40 mm, rectangle) was then implanted with dental acrylic. Prophylactic antibiotics were injected intramuscularly on the day of surgery and daily for one week thereafter.
The activity of single neurons was recorded with custom-made glass-insulated elgiloy microelectrodes (0.5–1.5 MΩ), using conventional electrophysiological techniques similar to those described in our previous studies (Iba and Sawaguchi, 2002; Tsujimoto and Sawaguchi, 2004a,b). A plastic grid with numerous small holes (0.7 mm ID, 1.5 mm apart) was attached to the cylinder to provide a coordinate frame for vertical electrode penetration (Crist et al., 1988). The electrode was advanced vertically to the cortical surface using a pulse motor-driven micromanipulator (MO-81, Narishige, Tokyo). We did not pre-screen neurons for task-related responses. Rather, we advanced the electrode until the activity of one or more neurons was well isolated and then commenced data recording. A window discriminator (DDIS-1, BAK Electronics, Germantown MD) digitized neuronal activity. An A/D converter digitized the data for task events and eye positions. These digitized data were stored on a RAM disk in a data-collection computer and then transferred to magneto-optical diskettes for off-line analysis. Furthermore, all analogue data (i.e. neuronal activity, eye positions, and task sequence) were recorded on digital audiotape (DAT) using an eight-channel DAT recorder (PC-208 M, Sony, Tokyo).
We focused on neurons in the DLPFC rostral to the frontal eye field (FEF) (Fig. 1C). To estimate the FEF physiologically, we applied intracortical microstimulation (ICMS; 22 cathodal pulses of 0.3 ms duration at 333 Hz, up to 100 μA) through the recording electrodes. When eye movements were elicited by the ICMS, the site was considered to be within the FEF (Bruce et al., 1985) and data recorded from these sites were excluded from this study.
For one monkey (monkey SZ), the recording sites were also identified histologically. The second monkey (monkey SN) is still alive and is participating in another study. Monkey SZ was deeply anaesthetized with an overdose of pentobarbital sodium and perfused with 0.9% saline, followed by 10% formalin. The brain was removed and photographed, and then the cortical surface was examined to detect the penetration points.
In this study, we were interested in the effects of a preceding response-outcome on neuronal processes during the post-response period. Therefore, we focused on neuronal activity during the F2-period (500–2500 ms after target onset) and applied a two-factor analysis of variance (ANOVA), separately for each task condition, to examine the effects of saccade direction and its outcome (immediate reward or delayed reward) on the discharge rate during the F2-period. We focused on neurons that showed either significant main effects of both factors or a significant two-way interaction (P < 0.05), because both the preceding response and its outcome influenced their F2-period activity. The onset of this activity was defined as the end of the first bin in which the firing rate differed by >2 SD from the mean firing rate during the 1 s control period immediately preceding cue onset. To test whether these neurons also showed directional cue- and/or delay-period activity, activity during the cue and delay periods was examined using a one-way ANOVA for each task separately, after collapsing the data across reward conditions.
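The two-factor ANOVA described above can be illustrated with a short sketch that computes the F statistics and degrees of freedom for the direction and reward-condition factors and their interaction. This is an illustration under stated assumptions, not the authors' analysis code: the function name `two_way_anova` and its input layout are hypothetical, and the sketch assumes an equal number of trials in every cell (a balanced design), which real recordings need not satisfy. Converting F to P requires the F distribution from a statistics library and is omitted here.

```python
from itertools import product


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (reward_condition, direction) -> list of F2-period firing
    rates, one value per trial. Every combination must be present and hold
    the same number of trials (assumption of this sketch).
    Returns {effect: (F, df_effect, df_error)}.
    """
    conds = sorted({c for c, _ in cells})
    dirs = sorted({d for _, d in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(cells[(c, d)]) != n for c, d in product(conds, dirs)):
        raise ValueError("sketch needs >= 2 trials and equal counts per cell")

    def mean(xs):
        return sum(xs) / len(xs)

    a, b = len(conds), len(dirs)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    cond_mean = {c: mean([cell_mean[(c, d)] for d in dirs]) for c in conds}
    dir_mean = {d: mean([cell_mean[(c, d)] for c in conds]) for d in dirs}
    grand = mean(list(cell_mean.values()))

    # Sums of squares for the two main effects, the interaction and the error.
    ss_cond = b * n * sum((cond_mean[c] - grand) ** 2 for c in conds)
    ss_dir = a * n * sum((dir_mean[d] - grand) ** 2 for d in dirs)
    ss_int = n * sum(
        (cell_mean[(c, d)] - cond_mean[c] - dir_mean[d] + grand) ** 2
        for c, d in product(conds, dirs)
    )
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_cond, df_dir = a - 1, b - 1
    df_int, df_err = df_cond * df_dir, a * b * (n - 1)
    ms_err = ss_err / df_err  # zero if there is no within-cell variability
    return {
        "condition": (ss_cond / df_cond / ms_err, df_cond, df_err),
        "direction": (ss_dir / df_dir / ms_err, df_dir, df_err),
        "interaction": (ss_int / df_int / ms_err, df_int, df_err),
    }
```

With two reward conditions and six directions, the effect degrees of freedom are 1 and 5, matching the form of the F statistics reported for individual neurons.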
To examine selectivity quantitatively for saccade direction and reward condition in F2-period activity, we calculated a directional selectivity index (DSI) and a condition selectivity index (CSI) (Moody et al., 1998; Tsujimoto and Sawaguchi, 2004b). DSI was calculated as follows:
Results
We recorded activity from a total of 302 neurons in the DLPFC (both hemispheres for monkey SN and left hemisphere for monkey SZ) while the monkeys performed the MGS-R and VGS-R tasks under the two conditions (monkey SN, n = 207; monkey SZ, n = 95). The activity during the post-response F2-period (500 to 2500 ms after target onset) was examined. Approximately half of the recorded neurons (n = 169, 56%) showed either a significant main effect in at least one of the two factors or a two-way interaction, as revealed by a two-way ANOVA (Direction × Reward Condition, P < 0.05), in at least one task condition. Sixty-nine (41%; monkey SN, n = 51; monkey SZ, n = 18) of these neurons showed either a significant main effect in both factors or an interaction between the two factors; i.e. they were influenced by both the direction of the preceding saccade and its outcome (immediate reward or delayed reward), and hence coded response-outcome (Tsujimoto and Sawaguchi, 2004a). The proportion of these neurons relative to all the recorded neurons did not differ significantly between the two monkeys (SN versus SZ, 25 versus 19%; χ2 = 0.53, P > 0.1). We focused here on these 69 neurons. They were classified into three groups: ‘MGS-specific neurons’ (n = 39, 57%), ‘VGS-specific neurons’ (n = 18, 26%) and ‘non-specific neurons’ (n = 12, 17%). The MGS-specific neurons showed F2-period activity coding response-outcome in the MGS-R task only, whilst the VGS-specific neurons showed such activity in the VGS-R task only. ‘Non-specific neurons’ showed such activity in both the MGS-R and VGS-R tasks.
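The selection and grouping rule stated above can be written out explicitly. The sketch below is an illustration only: the function names and the tuple layout of the P values are hypothetical, and the P values are assumed to come from the per-task two-way ANOVA (Direction × Reward Condition) at the P < 0.05 criterion used in the text.

```python
ALPHA = 0.05  # significance criterion used in the text


def codes_response_outcome(p_direction, p_reward, p_interaction, alpha=ALPHA):
    """True if both main effects are significant, or the interaction is."""
    return (p_direction < alpha and p_reward < alpha) or p_interaction < alpha


def classify_neuron(mgs_p, vgs_p):
    """Assign a neuron to a group from its ANOVA results in each task.

    mgs_p and vgs_p are (p_direction, p_reward, p_interaction) tuples for the
    MGS-R and VGS-R tasks, respectively (hypothetical input layout).
    """
    in_mgs = codes_response_outcome(*mgs_p)
    in_vgs = codes_response_outcome(*vgs_p)
    if in_mgs and in_vgs:
        return "non-specific"
    if in_mgs:
        return "MGS-specific"
    if in_vgs:
        return "VGS-specific"
    return "not selected"
```

For example, a neuron with significant direction and reward effects in the MGS-R task but no significant effect in the VGS-R task would be labelled MGS-specific.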
Some of these neurons (18/69, 26%) also showed directional activity during the cue and/or delay periods (n = 9 for cue only, n = 4 for delay only, n = 5 for both cue and delay). As summarized in Table 1, cue-period activity in most neurons of this population was not influenced by the task condition, whereas almost all neurons of this population exhibited directional delay-period activity either in the VGS task only or in both tasks. However, the present study focused mainly on the activity during the post-response F2-period, for the following reasons: (i) the sample of neurons that also showed cue- and/or delay-period activity was too small to analyse adequately; and (ii) our purpose was to examine whether the neuronal activity coding response-outcome is influenced by the information that guided the response.
|MGS task only||0||1||0||1||1||0||0||1|
|VGS task only||1||0||0||1||2||1||0||3|
F2-period Activity of Representative Neurons
Figure 2 shows the F2-period activity of an MGS-specific neuron. In this figure, raster displays and averaged histograms of the activity of a single neuron are illustrated separately for the two conditions and the six directions. In the MGS-R task (Fig. 2A), this neuron showed a clear phasic increase in firing rate during the F2-period in the DL-Rw condition (i.e. after click tone without reward delivery), particularly in 120° and 180° trials, whereas this neuron showed little change in activity in the IM-Rw condition (i.e. after reward delivery). According to the two-way ANOVA, the mean discharge rate during the F2-period was significantly different both between two conditions [F(1,170) = 13.76, P < 0.001] and across six directions [F(5,170) = 2.27, P < 0.05]. On the other hand, in the VGS-R task (Fig. 2B), the F2-period activity of this neuron did not show a significant difference in discharge rate between either the two conditions [F(1,169) = 0.10, P > 0.1] or among six directions [F(5,169) = 0.91, P > 0.1]. Thus, this neuron's F2-period activity was modulated not only by the direction and reward condition but also by the task condition.
To examine the reproducibility of the F2-period activity of the neuron in Fig. 2 across different task blocks, we plotted the activity for the direction with maximum activity (i.e. 120°) in the preferred condition (i.e. DL-Rw condition) separately for each task block, in sequence (Fig. 3A). In the first block (VGS-R task), this neuron showed little change in activity during the F2-period. However, after the task condition was switched to the MGS-R task, this neuron showed phasic activation following the click sound without reward delivery, even though neither the saccade direction nor the reward condition changed across this block transition. The neuron again showed little change in activity when the block was changed again, and its phasic activation during the F2-period was reproduced across three different blocks of the MGS-R task.
The F2-period activity of the neuron illustrated in Fig. 2 appeared to show spatial tuning in the DL-Rw condition of the MGS-R task. To quantify this, we applied Gaussian function-fitting curves to the mean discharge rates during the F2-period for each direction separately for the four conditions (Fig. 3B). As shown in Figure 3B, this neuron showed clear spatial tuning only for the DL-Rw condition of the MGS-R task (Td = 59°, best direction = 118°). In the other three conditions, this neuron did not show a clear difference in F2-period activity across six directions and clear spatial tunings were not observed.
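A Gaussian tuning fit of this kind can be sketched as follows. This is not the authors' fitting procedure, which the text does not specify: the sketch assumes a model of the form rate = baseline + amplitude × exp(−(Δθ)²/(2·Td²)), treats Td as the standard deviation of the Gaussian (an assumption, since Td is not defined here), and uses a plain grid search over best direction and Td with a closed-form least-squares solution for baseline and amplitude. The names `fit_gaussian_tuning` and `ang_diff` and the search ranges are hypothetical.

```python
import math

DIRECTIONS = [0, 60, 120, 180, 240, 300]  # six target directions (deg)


def ang_diff(a, b):
    """Signed angular difference a - b, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def fit_gaussian_tuning(rates, directions=DIRECTIONS):
    """Grid-search fit of rate = base + amp * exp(-0.5 * (dtheta / td) ** 2).

    rates: mean F2-period discharge rate for each direction.
    Searches best direction in 1 deg steps and td from 10 to 120 deg
    (assumed ranges); base and amp are solved by linear least squares.
    """
    n = len(rates)
    sum_r = sum(rates)
    best = None
    for mu in range(360):
        for td in range(10, 121):
            g = [math.exp(-0.5 * (ang_diff(d, mu) / td) ** 2) for d in directions]
            sum_g = sum(g)
            sum_gg = sum(x * x for x in g)
            sum_gr = sum(x * r for x, r in zip(g, rates))
            det = n * sum_gg - sum_g * sum_g
            if abs(det) < 1e-12:
                continue  # basis is effectively flat; amplitude is undefined
            amp = (n * sum_gr - sum_g * sum_r) / det
            if amp <= 0:
                continue  # only consider tuning as an increase in rate
            base = (sum_r - amp * sum_g) / n
            sse = sum((base + amp * x - r) ** 2 for x, r in zip(g, rates))
            if best is None or sse < best[0]:
                best = (sse, mu, td, base, amp)
    if best is None:
        return None
    sse, mu, td, base, amp = best
    return {"best_direction": mu, "td": td, "baseline": base,
            "amplitude": amp, "sse": sse}
```

With only six sampled directions and four free parameters, such a fit is weakly constrained, so a flat response profile (as in the other three conditions) yields no meaningful best direction or tuning width.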
Figure 4 illustrates the F2-period activity of a VGS-specific neuron. This neuron showed increased activity during the F2-period in the IM-Rw condition of the VGS-R task only, especially for the upper left (120°) trials. According to the ANOVA, in the VGS-R task the F2-period activity of this neuron differed significantly both between the two reward conditions, i.e. according to whether or not the response was rewarded immediately [F(1,151) = 40.24, P < 0.001], and across the six directions of the immediately preceding saccade [F(5,151) = 2.54, P < 0.05]. In the MGS-R task, however, the F2-period activity of this neuron did not differ significantly either between the two reward conditions [F(1,164) = 3.04, P > 0.05] or across the six directions [F(5,164) = 0.92, P > 0.05].
Figure 5 illustrates the mean discharge rate during the F2-period of the VGS-specific neuron shown in Figure 4, with Gaussian function-fitting curves. As expected, the F2-period activity of this neuron showed clear spatial tuning in only one of the four conditions; i.e. the IM-Rw condition of the VGS-R task (Td = 45°, best direction = 95°).
An example of the F2-period activity of a non-specific neuron is shown in Figure 6A. For the lower right (300°) trials (Fig. 6A, upper row), this neuron showed clear increases in activity during the F2-period in the DL-Rw condition in both the MGS-R and VGS-R tasks. By contrast, for trials in the opposite direction (120°) (Fig. 6A, lower row), this neuron did not show any clear change in activity during the F2-period in any of the four conditions. The two-way ANOVA revealed significant main effects for both the condition and direction factors in both the MGS-R and VGS-R tasks [MGS-R task, for the factor of condition, F(1,113) = 12.37, P < 0.001, for the factor of direction, F(5,113) = 6.31, P < 0.001; VGS-R task, for the factor of condition, F(1,116) = 25.11, P < 0.001 and for the factor of direction, F(5,116) = 13.17, P < 0.001].
The spatial tuning of the F2-period activity of this neuron is shown in Figure 6B. The F2-period activity of this neuron showed clear spatial tuning in the DL-Rw condition in both the MGS-R and VGS-R tasks, but not in the IM-Rw condition in either task. The properties of the spatial tuning of this neuron (i.e. tuning width and direction) were quite similar between the MGS-R and VGS-R tasks (for the MGS-R task, Td = 52°, best direction = 277°; for the VGS-R task, Td = 51°, best direction = 275°).
Overall Activity of MGS-specific, VGS-specific and Non-specific Neurons
To quantitatively compare selectivity for direction and reward conditions across three groups of neurons, we calculated a directional selectivity index (DSI) and a condition selectivity index (CSI) for each neuron (see Materials and Methods). These indices could range from 0 to 1, with larger values indicating higher selectivity for direction or condition. As summarized in Table 2, all three groups showed similar DSI values (Wilcoxon signed-rank test, P > 0.1), indicating that the F2-period activity of the three groups of neurons had similar discriminability for the direction of the immediately preceding saccade. In particular, non-specific neurons showed quite similar directional selectivity between the MGS-R and VGS-R tasks, compatible with the similar patterns of population-level activity during the F2-period, as shown in Figure 7C.
|Measure||MGS-specific (n = 39)||VGS-specific (n = 18)||Non-specific: MGS-R (n = 12)||Non-specific: VGS-R (n = 12)|
|DSI||0.41 ± 0.10||0.37 ± 0.13||0.35 ± 0.09||0.35 ± 0.08|
|CSI||0.32 ± 0.13*||0.24 ± 0.13||0.28 ± 0.08||0.27 ± 0.12|
|Onset (ms)||558 ± 375||519 ± 327||437 ± 372||475 ± 393|
Values represent mean ± SD for each condition.
*P < 0.05.
CSI values are summarized in Table 2. MGS-specific neurons showed significantly higher CSI values than the other groups (Mann–Whitney U-test, P < 0.05), indicating that the F2-period activity of MGS-specific neurons discriminated between reward and non-reward more strongly than that of either VGS-specific or non-specific neurons. Again, non-specific neurons showed similar discriminability for the reward condition between the MGS-R and VGS-R tasks (Mann–Whitney U-test, P > 0.1).
To examine further the properties of the three groups of neurons, we calculated the onset latency of the F2-period activity for the preferred direction and reward condition (see Materials and Methods). As summarized in Table 2, the neurons in our sample began to change their activity ∼500 ms after the onset of the F2-period (i.e. the first reward or click tone). We did not detect statistically significant differences in onset latency across neuronal groups, although the onset for non-specific neurons tended to be earlier than that for task-specific neurons.
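The onset criterion given in Materials and Methods (end of the first bin in which the firing rate differs by >2 SD from the mean rate during the 1 s control period) can be sketched as below. This is an illustration, not the authors' code: the bin width is not stated in the text, so the 50 ms default is an assumption, as are the function name `onset_latency_ms` and the use of the sample standard deviation.

```python
from statistics import mean, stdev


def onset_latency_ms(control_rates, f2_rates, bin_ms=50, n_sd=2.0):
    """Return the onset latency in ms from the start of the F2-period.

    control_rates: binned firing rates from the 1 s control period before
                   cue onset (needs at least two bins).
    f2_rates:      binned firing rates starting at F2-period onset.
    bin_ms:        bin width; 50 ms is an assumed value, not from the paper.
    Returns the time of the END of the first bin whose rate differs from the
    control mean by more than n_sd control SDs, or None if no bin qualifies.
    """
    baseline = mean(control_rates)
    threshold = n_sd * stdev(control_rates)  # sample SD (assumption)
    for i, rate in enumerate(f2_rates):
        if abs(rate - baseline) > threshold:
            return (i + 1) * bin_ms
    return None
```

Because the criterion uses the absolute difference from baseline, it detects decreases as well as increases in firing rate, which matches the wording "differed from" in the definition.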
Overall, we did not detect any clear clusters of these three indices (DSI, CSI, and onset) within each neuronal group. Therefore, to examine further the activity patterns of each group of neurons, we summed neuronal activity for the direction with maximum F2-period activity separately for each group and made population histograms (Fig. 7). At the population level, MGS-specific neurons showed significant increases in activity during the F2-period in the preferred condition of the MGS-R task only, while they did not show any clear change in activity during the F2-period in the non-preferred condition of the MGS-R task and notably in both conditions in the VGS-R task (Fig. 7A). Similarly, VGS-specific neurons showed significantly different F2-period activity according to the reward condition in the VGS-R task only (Fig. 7B). A population of non-specific neurons showed distinct F2-period activity according to the reward condition, but not to the task condition; the time course and discharge rate were quite similar between the two tasks (Fig. 7C). It is notable that all three groups of neurons exhibited little change in activity before the saccadic responses (i.e. delay period).
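The construction of a population histogram from each neuron's preferred direction can be sketched as follows. This is a simplified illustration: the input layout (one list of binned F2-period rates per direction for each neuron) and the function names are hypothetical, and the sketch returns the across-neuron mean per bin, which differs from the summed activity described in the text only by a constant factor.

```python
def preferred_direction(psth_by_direction):
    """Direction with the highest mean F2-period rate for one neuron."""
    return max(
        psth_by_direction,
        key=lambda d: sum(psth_by_direction[d]) / len(psth_by_direction[d]),
    )


def population_histogram(neurons):
    """Average the preferred-direction histograms across neurons.

    neurons: list of dicts mapping direction (deg) -> list of binned rates;
             all lists are assumed to have the same number of bins.
    """
    if not neurons:
        raise ValueError("need at least one neuron")
    first = next(iter(neurons[0].values()))
    totals = [0.0] * len(first)
    for psths in neurons:
        best = preferred_direction(psths)
        for i, rate in enumerate(psths[best]):
            totals[i] += rate
    return [t / len(neurons) for t in totals]
```

Each neuron contributes the histogram for its own preferred direction, so neurons with different best directions are aligned on their maximal response before pooling.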
Discussion
The present study examined post-response neuronal activity in the DLPFC of monkeys performing the MGS-R and VGS-R tasks under two conditions regarding reward contingency. The two tasks differed in terms of the information (i.e. memory or sensory) that guided the response. We identified three groups of neurons whose post-response F2-period activity was influenced by both the preceding response and its outcome. Interestingly, the F2-period activity of a subset of these neurons was also modulated by the information type (memory or sensory) used to guide the response. These findings are the first to show that single neurons in the DLPFC encode three factors relating to earlier events: the direction of the immediately preceding response, its outcome and the information type that guided the response, i.e. stimulus–response–outcome. This mechanism would play a role in ‘retrospective monitoring’, rather than on-going monitoring, of behavioural context and/or neural systems used to guide the behaviour.
Functional Implication of Non-specific Neurons
The F2-period activity of the present non-specific neurons was influenced by both the preceding response and its outcome, but not by the task condition. Since the properties of the F2-period activity of these neurons, including its time course and spatial tuning, were quite similar between the two tasks, these neurons may play a role in associating the preceding behavioural response and its outcome irrespective of the information used to guide the response. Alternatively, the directional selectivity of the F2-period activity of these neurons may reflect the position of the gaze rather than the direction of the saccade. However, although further studies are required to dissociate the effects of eye position and saccade direction for the F2-period activity of the present non-specific neurons, our previous study with the MGS-R task showed that the directional selectivity of the F2-period activity with reward dependency appears to be associated not with the position of the gaze but with the direction of the saccade (Tsujimoto and Sawaguchi, 2004a). Together with this previous finding, our non-specific neurons should provide further evidence that a fraction of DLPFC neurons code outcome (reward or non-reward) associated with the preceding directional response (Tsujimoto and Sawaguchi, 2004a).
Recently, Barraclough et al. (2004) reported that some neurons in the DLPFC modulate their activity according to the subject's choice and/or its outcome in the previous trial during an oculomotor free-choice task. The authors suggest that the DLPFC is involved in updating the subject's decision-making strategy based on a reinforcement-learning algorithm. Our data regarding non-specific neurons and coding of response-outcome concur with this previous study and further suggest that the DLPFC plays an important role in optimizing the behavioural control processes by bridging the information about the response and its outcome in the preceding trial to the current, on-going trial. However, it is still unclear how the post-response activity of the DLPFC neurons influences the neural activity and behavioural responses during the future trials. Further studies using appropriate paradigms and methodologies may clarify this problem and will contribute to an understanding of the neural mechanisms for complex behaviours.
Functional Implication of MGS-specific and VGS-specific Neurons
Of particular interest among the present findings is the task-dependent F2-period activity with selectivity for both direction and condition (outcome); i.e. that of MGS-specific and VGS-specific neurons. The directional selectivity of these neurons would not depend on the position of fixation, because these neurons showed task-dependent differences in F2-period activity despite the fact that the monkeys fixated on the same spot in both tasks. Therefore, the F2-period activity of MGS-specific and VGS-specific neurons was modulated by the following three factors: the direction of the preceding response, its outcome (immediate reward or delayed reward) and the information (memory or sensory) used to guide the response. Monitoring and associating these prior events is critical in adapting to dynamically changing environments (Herrnstein, 1961; MacArthur and Pianka, 1966; Platt, 2002) and the DLPFC is thought to play a critical role in effective, organized behaviour according to the given situation (Duncan, 2001; Miller and Cohen, 2001; Tanji and Hoshi, 2001). Our MGS-specific and VGS-specific neurons may be involved in such processes as monitoring and associating the preceding events. Such ‘retrospective monitoring’ would subserve context-dependent, flexible control of goal-directed behaviour.
These findings and interpretations are in line with previous neurophysiological studies. Many previous studies have indicated that DLPFC neurons show activity related to various aspects of goal-directed behaviours, especially those guided by visuospatial information (Fuster, 1973; Kubota et al., 1974; Niki, 1974; Boch and Goldberg, 1989; Funahashi et al., 1989; Hasegawa et al., 1998; Sawaguchi and Yamane, 1999; Takeda and Funahashi, 2002; Fukushima et al., 2004; for reviews, see Funahashi and Kubota, 1994; Goldman-Rakic, 1995; Fuster, 1997), while other studies have reported that DLPFC neurons in monkeys show activity related to reward and/or expectancy of reward (Niki and Watanabe, 1979; Watanabe, 1989, 1996; Leon and Shadlen, 1999; Kobayashi et al., 2002; Watanabe et al., 2002; Wallis and Miller, 2003b; Roesch and Olson, 2003). From these studies, it has been proposed that spatial cognitive information and reward information may be integrated in the DLPFC to control goal-directed behaviour (Watanabe, 1996; Kobayashi et al., 2002; Wallis and Miller, 2003b). In addition, another line of neurophysiological study has shown that a subset of DLPFC neurons encode on-going tasks (Asaad et al., 2000) or abstract rules (White and Wise, 1999; Wallis et al., 2001; Wallis and Miller, 2003a). These neurons may be the basis of adaptive behaviour according to the given on-going rule (Miller, 2000). In line with these previous studies, we demonstrate here that a fraction of DLPFC neurons differentially represent both the preceding spatial response and the reward or non-reward depending on different cognitive spatial information (visuospatial sensory and memory). This suggests that DLPFC neurons contribute to flexible control of behaviour based not only on the on-going rule and context but also on preceding events.
Our findings and interpretations for the MGS-specific and VGS-specific neurons are also consistent with previous neuropsychological and brain-imaging studies in humans as well as monkeys. For example, many neuropsychological and brain-imaging studies have shown that the DLPFC is essential for performance of the Wisconsin Card Sorting Test (WCST) (Milner, 1963; Nelson, 1976; Dias et al., 1997; Monchi et al., 2001; Barcelo and Knight, 2002) and the self-ordered task (Petrides and Milner, 1982; Passingham, 1985; Owen et al., 1996; Collins et al., 1998; Levy and Goldman-Rakic, 1999), both of which require the subjects to monitor and manipulate their own prior choice and its outcome and to select appropriate responses based on such cognitive manipulation. In particular, Monchi et al. (2001), using event-related fMRI, indicated that the DLPFC is activated in the WCST when the subject receives feedback on his/her choice, that is, at the point when the outcome of the choice should be associated with earlier events such as the stimulus chosen. In addition, Goldman-Rakic and her colleague showed that the DLPFC of monkeys plays a more important role in spatial than in non-spatial self-ordered tasks (Levy and Goldman-Rakic, 1999, 2000), suggesting that the DLPFC is implicated more in monitoring and manipulating choices based on spatial information than in those based on non-spatial information. Our MGS-specific and VGS-specific neurons may be neuronal correlates of such a process of monitoring and integrating prior events, particularly those related to prior spatial behaviour. In this sense, our finding that the MGS-specific neurons were more numerous and more selective than the VGS-specific neurons is consistent with the fact that the DLPFC appears to be more sensitive to the process of integrating the outcome of one's choice with mnemonic information than with sensory information (Petrides, 1991, 2000).
Coding of prior memory and sensory information may not be explicitly needed in our real-life behavioural control processes. One possible explanation for the present results regarding task-specific activity is that the subjects were extensively over-trained in the present task situations, which may have resulted in the task-specific activity. In relation to this point, the task-specificity of neuronal activity might arise because the task condition is an important piece of information for the monkeys in performing the task efficiently, which is critical for the monkeys' survival (i.e. to get water). Alternatively, such coding might be related to monitoring and controlling the neural systems used to guide the response, which is also an important mechanism for efficient cognitive control. Indeed, different neural systems are involved in generating visually guided and memory-guided saccades (Sweeney et al., 1996; Brown et al., 2004). Post-saccadic activity, which has been considered to be an efferent copy or a corollary discharge from the oculomotor centres (Bizzi, 1968; Bruce and Goldberg, 1985; Sommer and Wurtz, 2004; also see Funahashi and Kubota, 1994), has been observed in the DLPFC (Joseph and Barone, 1987; Funahashi et al., 1991) as well as in the FEF (Bizzi, 1968; Bruce and Goldberg, 1985), which is connected with the DLPFC (Barbas and Mesulam, 1981; Watanabe-Sawaguchi, 1991). Furthermore, the PFC of rats may play a role in controlling other brain areas (Jackson and Moghaddam, 2001). The task-specific post-response activity observed here may therefore play a role in ‘retrospective monitoring’ of the neural systems used to guide the saccade, thereby contributing to the control of behaviour and/or neural systems.
Relation to the Delay-period Activity
In our previous study (Tsujimoto and Sawaguchi, 2004b), we examined delay-period activity in DLPFC neurons during the MGS and VGS tasks without any different reward conditions and showed that most of the DLPFC neurons with directional delay-period activity exhibited sustained activation during the delay period either in the VGS task only or in both tasks. In contrast, few neurons in the present sample showed directional delay-period activity, which explains the absence of change in population activity before saccadic responses. This finding suggests that the present sample of DLPFC neurons constitutes a neuronal group distinct from the DLPFC neurons that show directional delay-period activity. Further, in contrast to the delay-period activity in the previous study (Tsujimoto and Sawaguchi, 2004b), neurons with MGS-specific activity were more numerous than VGS-specific and non-specific neurons during the post-response F2-period. This difference may imply that, in complex spatial behaviour, the DLPFC plays a major role in a monitoring or updating process based on memory rather than in simple short-term storage, which concurs with a concept proposed particularly by Petrides and colleagues (Petrides, 1998, 2000). Further studies that use tasks with multi-step responses, such as a self-ordered task, may be able to clarify this issue at the cellular level (Hasegawa et al., 2004).
In the present study, we provide evidence that a fraction of DLPFC neurons encode response-outcome based on the information (memory or sensory) used to guide the response, namely ‘retrospective monitoring’ of behavioural context and/or neural systems. Such neuronal coding should contribute to the ability to select appropriate responses according to a given situation and context that can change dynamically, which has long been considered a critical role of the PFC (Duncan, 2001; Miller and Cohen, 2001; Tanji and Hoshi, 2001). However, because the present behavioural paradigm did not explicitly require the subjects to control their behaviour based on the association of prior events, it remains unclear whether the neurons shown here are actually used in behavioural control processes. Nevertheless, the fact that DLPFC neurons show context-dependent coding during a post-response period improves our understanding of the neuronal systems of the DLPFC in the cognitive control of behaviour.
The authors thank K. Watanabe-Sawaguchi and E. Ishida for their assistance with animal care and surgery. This work was supported by a Grant-in-Aid for JSPS fellows (155009927) from the Japan Society for the Promotion of Science to S.T. and by Grants-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science, and Technology to T.S.
¹Laboratory of Cognitive Neurobiology, Hokkaido University Graduate School of Medicine, Sapporo 060-8638, Japan and ²Core Research for Evolutional Science and Technology, Japan Science and Technology, Saitama 332-0012, Japan