Temporal prediction (TP) is needed to anticipate future events and is essential for survival. Our sense of time is modulated by emotional and interoceptive (corporal) states that are hypothesized to rely on a dopamine (DA)-modulated “internal clock” in the basal ganglia. However, the neurobiological substrates for TP in the human brain have not been identified. We tested the hypothesis that TP involves DA striato-cortical pathways, and that accurate responses are reinforcing in themselves and activate the nucleus accumbens (NAc). Functional magnetic resonance imaging revealed the involvement of the NAc and anterior insula in the temporal precision of the responses, and of the ventral tegmental area in error processing. Moreover, NAc showed higher activation for successful than for unsuccessful trials, indicating that accurate TP per se is rewarding. Inasmuch as activation of the NAc is associated with drug-induced addictive behaviors, its activation by accurate TP could help explain why video games that rely on TP can trigger compulsive behaviors.
Adaptive organisms must predict future events in time scales that are useful for survival. Temporal prediction (TP), the ability to estimate the short-term evolution of future events, is continuously shaped by reward/aversion learning processes as a function of sensory information (Schultz et al. 1997; Nobre et al. 2007) as well as emotional and interoceptive states (Pollatos et al. 2014). Our sense of time and the temporal relationships between previous events help us to anticipate future events and to make faster responses (Jazayeri and Shadlen 2010). Multiple processes seem to determine our subjective perception of time in the range of hundreds of milliseconds to several seconds, including interoceptive signals in the insular cortex and an “internal clock” or internal timekeeper, presumably located in the basal ganglia and modulated by DA (Buhusi 2003; Coull et al. 2011; Merchant et al. 2013). Specifically, it is hypothesized that interval timing is encoded by coincident activity of cortical neurons that are detected by striatal neurons and conveyed through the thalamus into the cortex (Matell and Meck 2004). Since dopamine (DA) regulates striato-thalamo-cortical circuits, this could explain how DA signaling could modulate time perception (Meck 1986, 2006; Drew et al. 2007; Wiener et al. 2014). However, the neural bases of TP, including its cortical representations, are still poorly understood.
Behavioral and neuroimaging studies have shown that the sense of time is slower in elders and in psychiatric disorders of abnormal basal ganglia function (schizophrenia, Parkinson's disease, attention deficit hyperactivity disorder, and addiction; Allman and Meck 2012; Allman et al. 2014), which is consistent with DA's role in timekeeping operations (Bäckman et al. 2006; Dreher et al. 2008). Similarly, pharmacological studies in animals also support DA's role in timekeeping operations in the striatum (Maricq and Church 1983; Meck 1986; Oprisan and Buhusi 2011). These include the effects of drugs of abuse such as stimulants, which lead to overestimations of time duration (Cheng et al. 2006; Wittmann et al. 2007), or alcohol and cannabinoids, which lead to underestimations of time duration (Tinklenberg et al. 1976).
Accumulating evidence from human and nonhuman primates suggests that DA neurons in the midbrain code reward prediction and support reward learning (Schultz 1998). Temporal difference (TD) learning (Sutton and Barto 1981; Sutton 1988), a differential learning method, has been proposed to explain reward-dependent learning (Schultz et al. 1997). Specifically, the firing rate of midbrain neurons appears to mimic the TD error function (Schultz et al. 1997), and functional magnetic resonance imaging (fMRI) studies on appetitive (taste) and aversive (pain) conditioning showed that reward and aversive signals in the ventral striatum and anterior insula seem to mimic the sequential learning signals predicted by TD models (McClure et al. 2003; O'Doherty et al. 2003; Seymour et al. 2004).
Successful TP requires accurate timekeeping (supported by TD reward learning) as well as attention/vigilance and working memory (WM), processes that are coupled because reward learning is associated with attention shifts (Maunsell 2004) supported by WM (Knudsen 2007). Thus, it is important to distinguish the neural system that mediates TP from those that mediate attention/vigilance and WM, processes that are concurrently active during a given TP task. To understand the neurobiology of TP, we asked subjects to predict the onset of circles periodically displayed on the screen while corresponding brain activity was assessed with 4-T fMRI. Along with dissecting TD learning and outcome components from the target onset prediction (TOP) task, control tasks of sensorimotor (SM), spatial attention (SA), and WM contributions to the TOP task were also studied to determine whether the TP activation signature could be distinguished from the control functions occurring concurrently with TP. As we hypothesized, correct performance on the TOP task elicited fMRI activation responses in the ventral striatum [location of the nucleus accumbens or NAc, key reward brain nucleus (Breiter et al. 1997; Breiter and Rosen 1999)] and anterior insula [key cortical hub involved in interoception and timekeeping (Livesey et al. 2007; Wittmann et al. 2010; Bueti and Macaluso 2011)]. Activation in the NAc and anterior insula was significantly greater in the TD learning condition than in the outcome condition of the TOP task, pointing to the importance of these regions for learning associated with reward, as has been shown with reinforcement learning (O'Doherty et al. 2003; Seymour et al. 2004; O'Doherty et al. 2006) and anticipation of uncertain outcomes following prospect theory (Breiter et al. 2001). TP-based activation was significantly greater than any SM, SA, and WM contributions, showing that TP function could be distinguished in a specific fashion from concurrent cognitive processes.
Prior reward/aversion studies, including studies of TP, have not to our knowledge explicitly shown this segregation from attention and WM functions, which are also modulated by midbrain DA function. This differentiation of TP from attention and WM processes was further accentuated by the observation of a significant correlation between the speed of the behavioral responses and TD learning activation to errors in the region of the ventral tegmental area (VTA), a source of midbrain DA neurons.
Materials and Methods
The 36 healthy participants (age: 27 ± 6 years, mean ± SD; 34 right-handed and 2 left-handed; 18 females) in this study were recruited from advertisements in local newspapers. Written informed consent was obtained from all participants prior to the study. Participants were excluded from the study if they had 1) a history of major psychiatric or neurological disease; 2) medical conditions that may alter cerebral function (i.e., cardiovascular, endocrinological, oncological, or autoimmune diseases); 3) head trauma with loss of consciousness for >30 min; or 4) use of prescribed psychoactive medications. The participants were instructed to discontinue any over-the-counter medication 2 weeks prior to the study. Food and beverages (except for water) were discontinued at least 4 h prior and cigarettes for at least 2 h prior to the study. Females were scanned in their mid-follicular phase. The study was approved by the Committee on Research in Human Subjects at Stony Brook University.
Prediction Learning Paradigm
During functional scans, participants engaged in 4 consecutive tasks (Supplementary Fig. 1A) involving SM coupling, TOP, spatial WM, and SA. The tasks had identical duration (4 min), number of trials (20), and intertrial interval (ITI = 12 s; 2 s jittering). Button press responses were recorded to determine reaction time (RT; average time difference between targets and responses) and performance accuracy (percentage of successful events relative to the total number of events). In addition, the subject's responses were used for general linear modeling (GLM) of brain activation responses during successful prediction as well as during prediction errors. The next paragraphs describe the trials for each of the 4 tasks in this paradigm.
The SM task involves visual perception and measures the RT required for the subjects to respond to the presence of a target, an open circle covering 10% and 12% of the horizontal and vertical visual fields (Supplementary Fig. 1B). A static white fixation cross was shown at the center of the black screen during 96% of the ITI. During each trial, the target was randomly flashed for 200 ms at 1 of the 4 corners of the screen in the peripheral visual field. The subjects' task was to respond to the appearance of the target by pushing an in-house MRI-compatible response button with their right index finger as quickly as possible upon target presentation. The 600-ms response window that followed the target was used to classify a button press response as successful (RT ≤ 600 ms) or unsuccessful (RT > 600 ms). After an expectation period of 1.3 s, an outcome message, “$” for a hit or “X” for a miss, briefly (500 ms) replaced the fixation cross at the center of the visual field as a feedback, to inform the subjects about their accuracy during the trial; after the outcome only the fixation cross was displayed for 9–11 s, until the onset of the next target.
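The hit/miss scoring described above can be illustrated with a minimal Python sketch. The function and variable names are hypothetical and do not come from the study software; treating a button press that precedes the target as a miss is also an assumption made only for this illustration.

```python
# Illustrative sketch only: names are hypothetical, not taken from the study software.
SM_RESPONSE_WINDOW_S = 0.600  # response window after target onset (from the text)


def classify_trial(target_onset_s, press_time_s, window_s=SM_RESPONSE_WINDOW_S):
    """Return (outcome, rt_s) for one trial.

    press_time_s is None when no button press was recorded.
    Assumption: a press before target onset (negative RT) is scored as a miss.
    """
    if press_time_s is None:
        return "miss", None
    rt_s = press_time_s - target_onset_s
    outcome = "hit" if 0.0 <= rt_s <= window_s else "miss"
    return outcome, rt_s


def feedback_symbol(outcome):
    """Outcome message shown in place of the fixation cross: '$' for a hit, 'X' for a miss."""
    return "$" if outcome == "hit" else "X"


def accuracy_percent(outcomes):
    """Percentage of successful events relative to the total number of events."""
    return 100.0 * sum(o == "hit" for o in outcomes) / len(outcomes)
```

For example, a press 450 ms after the target would be scored as a hit and a press 700 ms after the target as a miss; accuracy is then the percentage of hits over the 20 trials of the task.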
The TOP task is based on the observation that subsequent TPs are generally correlated in a way that incremental learning allows for dynamic adjustment of the prediction strategy leading to more accurate predictions in the near future. Subjects' time-predicting ability was optimized using temporal learning, a reinforcement learning model assuming that subsequent predictions are increasingly accurate. Each trial was initiated by a 1-s long cue displaying nonrepeating natural numbers (1–4), randomly, at the 4 quadrants of the central visual field, which revealed the spatial sequence of future events (circles subsequently displayed at the corners of the screen; Fig. 1 and Supplementary Fig. 1B). During the 3-step TD learning module, which started 200 ms after the cue, white circles were subsequently flashed for 200 ms every 1100 ms, one at each of the 4 corners of the visual field, sequentially as indicated by the cue. The subjects were instructed to press a button within a brief 300-ms response window at the onset of the last circle of the cue (i.e., to predict the onset of the target). The subjects were informed that they would not be able to make such fast responses if they waited to see the target, and that in order to achieve faster responses they must develop a target prediction strategy based on the regular onset of the previous circles in the TD learning module. Thus, target prediction was based on the regular timing of the previous events in the 3.3-s long TD learning module (Fig. 1C), which was grounded in TD learning principles. Basically, we assumed that the observation of an event allows reinforcement-based adjustments of brain circuitry that lead to improved predictions of future events. To better predict the onset of the circles, midbrain DA neurons fire in proportion to the TD error function, δ = V − W, the difference between the ideal prediction, V, and the current prediction, W (Fig. 1C).
Note that in this simple model we set the TD discount factor to 1 because the time scale of the blood oxygenation level-dependent (BOLD)-fMRI response (∼20 s) prevents assessment of this decay parameter within the length of the TD learning module (3.3 s). The outcome was presented as in the SM task, and followed by the fixation cross that was displayed after the outcome until the onset of the next cue, 4.5–6.5 s. Because the TOP task also involves spatial WM and SA, the proposed battery also includes additional WM and SA tasks to assess the potential confounds of these cognitive domains on TD learning.
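To make the simplified error function concrete, the following Python sketch applies δ = V − W over the three steps of a learning module. It is only an illustration under stated assumptions: the delta-rule update W ← W + αδ, the learning rate α, and the initial prediction are hypothetical and are not specified in the text; only the 1100-ms interval between circles and the discount factor of 1 come from the task description.

```python
# Illustrative sketch only: the update rule, learning rate, and initial
# prediction are assumptions; they are not parameters reported in the study.

def run_td_module(observed_intervals_s, initial_prediction_s, learning_rate=0.5):
    """Track the simplified TD error, delta = V - W, across learning steps.

    V is the observed inter-onset interval (the ideal prediction) and W is the
    current prediction. With the discount factor set to 1, each step reduces
    to a delta-rule update: W <- W + learning_rate * delta.
    Returns the list of errors and the final prediction.
    """
    w = initial_prediction_s
    errors = []
    for v in observed_intervals_s:
        delta = v - w                 # TD error for this step
        errors.append(delta)
        w = w + learning_rate * delta  # move the prediction toward the observation
    return errors, w


# Three steps with circles flashed every 1100 ms (from the task description);
# the 0.9-s starting prediction is hypothetical.
errors, final_prediction = run_td_module([1.1, 1.1, 1.1], initial_prediction_s=0.9)
```

Under these assumptions the magnitude of the error shrinks at every step, so the prediction approaches the true interval without necessarily reaching it within a time-limited module.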
The WM task assesses the memory burden of the TOP task, which requires holding in memory the screen locations of subsequent circles until the end of the TD learning epoch. For each trial, a 1-s long cue showing a spatial random (nonrepeating) sequence of the natural numbers 1–4 was displayed for visual memory encoding as in the TOP task (Supplementary Fig. 1B). Subsequently, 200 ms after the cue, circles were flashed at each of the 4 corners of the visual field (WM retrieval epoch), as in the TD learning epoch. However, the appearance of the circles on the screen matched the spatiotemporal sequence suggested by the cue in only 50% of the trials. Subjects were instructed to press the button as fast as possible when the circles appeared in a spatiotemporal sequence that matched the cue; as for the SM task, responses that occurred within 600 ms from the target were considered successful. Then, the outcome and the fixation cross were presented as in the TOP task until the onset of the next cue (4.5–6.5 s after the outcome).
The SA task assesses the vigilance burden of the TOP task, which requires sustained attention to detect the appearance of circles at the corners of the screen during the TD learning epoch. For each trial, a cue displayed a circle, randomly, at 1 of the 4 corners of the central visual field highlighting 1 of the 4 quadrants of the screen (Supplementary Fig. 1B). Subsequently, 200 ms after the cue, circles were flashed randomly and sequentially at each of the 4 corners of the visual field (focused SA epoch), as in the TD learning epoch. The subject was instructed to press a button when the circle appeared at the target quadrant originally highlighted by the cue; as for the SM task, responses that occurred within 600 ms from the target were considered successful. Then, the outcome and the fixation cross were presented as in the TOP task until the onset of the next cue (4.5–6.5 s after the outcome).
During the outcome, a money symbol ($) was used to indicate that responses were accurate (within time window). However, subjects were not rewarded with money for their RT or performance accuracy nor punished for their errors. Subject reimbursement for experiment participation was not affected by their performance. Post task questionnaires were not used to assess how rewarding the task was. The visual stimuli were presented to the subjects on MRI-compatible goggles (Resonance Technology, Inc., Northridge, CA, USA) connected to a personal computer. The software used to display the stimuli and record subject's responses was developed in Visual Basic and Visual C (Visual Studio; Microsoft Corp., Redmond, WA, USA) and was synchronized precisely with MR acquisition using an MRI trigger pulse. Prior to the MRI session, the participants completed a brief training session (5–10 min) on the tasks outside the MRI scanner to ensure that subjects understood and were able to perform the tasks.
MRI Data Acquisition
A 4-T whole-body Varian (Palo Alto, CA)/Siemens (Erlangen, Germany) MRI scanner with a T2*-weighted single-shot gradient echo-planar imaging (EPI) pulse sequence [echo time (TE)/repetition time (TR) = 20/1600 ms, 4-mm slice thickness, 1-mm gap, 35 coronal slices, 64 × 64 matrix size, 3.125 × 3.125 mm² in-plane resolution, 90° flip angle, 157 time points, 200.00 kHz bandwidth] with ramp-sampling and whole brain coverage was used to collect functional images with BOLD contrast. Padding was used to minimize motion. Earplugs (−28 dB sound pressure level attenuation; Aearo Ear TaperFit 2; Aearo Co., Indianapolis, IN, USA), headphones (−30 dB sound pressure level attenuation; Commander XG MRI Audio System, Resonance Technology Inc., Northridge, CA, USA), and a “quiet” EPI acquisition approach (Tomasi and Ernst 2003) were used to minimize the interference effect of scanner noise during fMRI (Tomasi et al. 2005). Anatomical images were collected using a T1-weighted three-dimensional modified driven equilibrium Fourier transform pulse sequence (TE/TR = 7/15 ms, 0.94 × 0.94 × 1.00 mm³ spatial resolution, axial orientation, 256 readout and 192 × 96 phase-encoding steps, 16 min scan time) and a modified T2-weighted hyperecho sequence (TE/TR = 0.042/10 s, echo train length = 16, 256 × 256 matrix size, 30 coronal slices, 0.86 × 0.86 mm² in-plane resolution, 5 mm thickness, no gap, 2 min scan time) to rule out gross morphological abnormalities of the brain.
Image reconstruction was performed using a phase correction method that produced minimal ghost artifacts (Caparelli and Tomasi 2008). The first 4 imaging time points were discarded to avoid nonequilibrium effects in the fMRI signal. The statistical parametric mapping package SPM8 (Wellcome Trust Centre for Neuroimaging, London, UK) was used for subsequent analyses. Ascending linear interpolation was used for the temporal realignment of the slices (i.e., “slice time correction”), which were acquired in a sequential order. Spatial realignment was performed with a fourth-degree B-spline function without weighting and without warping; head motion was <2-mm translations and <2° rotations for all scans. Spatial normalization to the stereotactic space of the Montreal Neurological Institute (MNI) was performed using a 12-parameter affine transformation with medium regularization, 16 nonlinear iterations, a voxel size of 3 × 3 × 3 mm³, and the standard SPM8 EPI template. Spatial smoothing was carried out using an 8-mm full-width at half-maximum (FWHM) Gaussian kernel.
Subject-Level Statistical Analyses
A first GLM (GLM1) was used to estimate the fMRI responses, independently for each task. For the SM task, the first-level GLM was based on 3 regressors modeling the canonical hemodynamic response function (HRF) elicited by the onsets of the 20 targets, as well as its first- and second-(dispersion) order time derivatives, convolved with low-pass (HRF) and high-pass (cut-off frequency: 1/128 Hz) filters to minimize field drifts and high-frequency noise. The 6 time-varying motion parameters (3 rotations and 3 translations) obtained from image realignment were used as covariates in the GLM to remove motion confounds in brain activation patterns. The same GLM approach was also used to estimate the fMRI responses for the TOP, WM, and SA tasks. The main regressor, however, was based on the onsets of the twenty 3500-ms task epochs (either TD learning, WM retrieval, or focused SA modules). Thus, 4 independent contrast maps reflecting the % BOLD-fMRI signal change from baseline (black screen with a fixation cross) caused by the SM, TOP, WM, and SA cues were obtained for each subject. An additional GLM was used to estimate BOLD-fMRI responses corresponding to the TD learning and the outcome epochs of the TOP task, independently for hits and errors (GLM2). Specifically, GLM2 included 12 regressors, 4 of which modeled the onsets of the TD learning and outcome epochs for successful (hit) and unsuccessful (miss) prediction trials; their first- and second-order time derivatives accounted for the remaining 8 regressors. As in GLM1, GLM2 used low- and high-pass filtering and included the 6 motion covariates.
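As an illustration of how an epoch regressor of this kind can be built, the Python sketch below convolves a boxcar of task epochs with a double-gamma HRF and samples it at each scan. It is a simplified sketch, not the SPM8 implementation: the HRF parameters (peak shape 6, undershoot shape 16, ratio 1/6) are a commonly used approximation assumed here, the evenly spaced onsets ignore the jitter, the function names are hypothetical, and the derivative regressors, filters, and motion covariates are omitted. The TR (1.6 s), the 153 retained scans (157 time points minus the 4 discarded), and the 3.5-s epoch length come from the text.

```python
import math

TR_S = 1.6      # repetition time (from the acquisition parameters)
N_SCANS = 153   # 157 acquired time points minus the 4 discarded ones


def double_gamma_hrf(t_s, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1.0 / 6.0):
    """Double-gamma HRF approximation (assumed parameters, unit time scale)."""
    if t_s <= 0.0:
        return 0.0

    def gamma_pdf(t, shape):
        return t ** (shape - 1.0) * math.exp(-t) / math.gamma(shape)

    return gamma_pdf(t_s, peak_shape) - undershoot_ratio * gamma_pdf(t_s, undershoot_shape)


def build_regressor(onsets_s, duration_s=0.0, tr_s=TR_S, n_scans=N_SCANS, dt_s=0.1):
    """Convolve a boxcar (or stick) function with the HRF and sample it once per scan."""
    n_fine = int(round(n_scans * tr_s / dt_s))
    stimulus = [0.0] * n_fine
    for onset in onsets_s:
        start = int(round(onset / dt_s))
        stop = max(start + 1, int(round((onset + duration_s) / dt_s)))
        for i in range(start, min(stop, n_fine)):
            stimulus[i] = 1.0
    kernel = [double_gamma_hrf(i * dt_s) for i in range(int(round(32.0 / dt_s)))]
    convolved = [0.0] * n_fine
    for i, s in enumerate(stimulus):
        if s:
            for j, h in enumerate(kernel):
                if i + j < n_fine:
                    convolved[i + j] += s * h * dt_s
    # Sample the high-resolution signal at the acquisition time of each scan.
    return [convolved[min(int(round(k * tr_s / dt_s)), n_fine - 1)] for k in range(n_scans)]


# Hypothetical onsets: 20 epochs spaced by the 12-s ITI, jitter ignored.
top_regressor = build_regressor([12.0 * i for i in range(20)], duration_s=3.5)
```

In a full analysis, this column would be entered in the design matrix together with its derivative regressors and the 6 motion covariates described above.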
Group-Level Statistical Analyses
One-way within-subject analysis of variance (ANOVA) was used to test the significance of common and differential brain activation signals to SM, TOP, WM, and SA in SPM8 using the BOLD contrast maps resulting from GLM1 (ANOVA1). Similarly, one-way within-subject ANOVA was used to test the significance of common and differential brain activation signals elicited by successful and unsuccessful prediction trials, independently for TD learning and outcome (ANOVA2). All BOLD contrast maps resulting from GLM2 estimations were included in ANOVA2. Voxel-wise SPM8 regression analyses were additionally used to test the linear association of brain activation signals with RT. A cluster-level PFWE < 0.05, corrected for multiple comparisons in the whole brain with the random field theory and a family-wise error correction, was used to test for the main effects of SM, TOP, WM, and SA on brain activation as well as for brain activation differences elicited between the TOP task and the SM, SA, and WM tasks. A cluster-forming threshold P < 0.001 and a minimum cluster size of 100 voxels were used for this purpose. Small volume corrections within a 10-mm spherical volume with cluster-forming threshold P < 0.001 and minimum cluster size of 100 voxels were used to test the statistical significance of hit > miss differential TOP activation and its association with RT in anterior insula, ventral striatum, and in a medial midbrain region that encompasses VTA.
Functional Region-of-Interest Analyses
Functional region-of-interest (ROI) measures were used to report average statistical values or BOLD-fMRI responses in a volume comparable to the image smoothness (e.g., resolution elements or “resels”) rather than single-voxel peak values as well as to test potential linear and nonlinear (u-shaped) relationships between brain activation and RT. The volume of the resels was estimated using the random field calculation in SPM8 as a near-cubic volume with Cartesian FWHM = 15.6, 13.9, and 16.1 mm. Thus, 9-mm isotropic masks containing 27 voxels (0.73 mL) were defined at the centers of relevant activation/deactivation/correlation clusters to extract the average % BOLD signal from individual contrast maps. These masks were created and centered at the precise coordinates listed in Table 1 and were fixed across subjects.
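The ROI geometry can be checked with a short Python sketch: a 3 × 3 × 3 block of 3-mm voxels spans 9 mm per side, contains 27 voxels, and has a volume of 729 mm³ (about 0.73 mL). The function names and the dictionary used to hold a contrast map are hypothetical conveniences for this illustration; they do not reflect how the masks were implemented in the study.

```python
from itertools import product

VOXEL_MM = 3.0  # normalized voxel size (from the preprocessing description)


def cubic_roi_coords(center_mm, edge_voxels=3, voxel_mm=VOXEL_MM):
    """Voxel-center coordinates (mm) of a cubic ROI centered at center_mm."""
    half = (edge_voxels - 1) / 2.0
    offsets = [(i - half) * voxel_mm for i in range(edge_voxels)]
    cx, cy, cz = center_mm
    return [(cx + dx, cy + dy, cz + dz) for dx, dy, dz in product(offsets, repeat=3)]


def roi_volume_ml(edge_voxels=3, voxel_mm=VOXEL_MM):
    """ROI volume in mL (1 mL = 1000 mm^3)."""
    return (edge_voxels * voxel_mm) ** 3 / 1000.0


def roi_mean(contrast_map, center_mm):
    """Average % BOLD signal in the ROI.

    contrast_map is a hypothetical dict mapping (x, y, z) in mm to a value.
    """
    values = [contrast_map[c] for c in cubic_roi_coords(center_mm) if c in contrast_map]
    if not values:
        raise ValueError("ROI does not overlap the contrast map")
    return sum(values) / len(values)
```

Because the mask is defined only by its center and the voxel size, the same set of offsets can be reused for every cluster center and every subject.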
|Regions|BA|MNI coordinates (mm)|k|TOP > SM and WM and SA|SM|TOP|WM|SA|
Note: Within-subject ANOVA.
As expected, the average RT across subjects was shorter for TOP (0.26 ± 0.08 s; mean ± SD) than for SM (0.64 ± 0.08 s), WM (0.45 ± 0.15 s), and SA (0.46 ± 0.08 s) (P < 0.00001, paired t-tests; Fig. 2). Thus, TD learning enabled significantly faster responses than strategies purely based on perception (SM) as well as those also based on WM and sustained attention (WM and SA). However, the faster responses for the TOP task could partially reflect preparation effects, because the instructions were different for the TOP task than for the SM, WM, and SA tasks. Performance accuracy was higher for TOP (77 ± 18%; mean ± SD) than for SM (46 ± 23%), and higher for WM (95 ± 9%) than for TOP (P < 2 × 10⁻⁹; Fig. 2). Accuracy did not show linear increases or decreases and remained stable during the performance of the TOP task (mean accuracy/trial = 77 ± 8%), indicating a lack of cumulative learning effects during the TOP task. Accuracy for the WM and SA tasks showed ceiling effects (Fig. 2), which suggests a lack of sensitivity to learning effects in these tasks. Indeed, the rates of hits and false alarms did not change over the course of the WM and SA tasks.
Voxel-wise statistical parametric mapping for the TOP task demonstrated positive fMRI responses (activation) in anterior cingulum (dACC), orbital fronto-insular, prefrontal, parietal, temporal and limbic cortices, somatosensory and motor areas, medial visual cortex, cerebellum, thalamus, midbrain, caudate, and NAc; the TOP task also caused negative fMRI responses (deactivation) in lateral occipital and parietal regions, supplementary motor area, paracentral lobule, precentral gyrus, and superior and middle frontal gyri (Supplementary Fig. 2 and Table 1; PFWE < 0.05, corrected for multiple comparisons at the cluster level with the random field theory and a family-wise error).
To identify areas uniquely involved in TP, we contrasted TOP activation responses to the co-activation responses of the SM, WM, and SA tasks using voxel-wise one-way within-subject ANOVA. We found that TP increasingly engaged DAergic regions (NAc, caudate, midbrain, and medial thalamus), anterior insula, dACC and the temporal pole, as well as motor, visual, and somatosensory cortices and disengaged superior frontal, parietal, and inferior and middle occipital cortices (Fig. 3 and Table 1; PFWE < 0.05). These results were consistent with our hypothesis that activation of the insula and DAergic regions is essential for TP.
To quantify the amplitude of the fMRI responses in brain regions involved in TP, we averaged the % BOLD signal changes from baseline within 27-voxel cubic ROIs centered at the cluster locations in Table 1. These ROI analyses confirmed the higher activation/deactivation responses for TOP than for SM, WM, and SA, separately, for all clusters highlighted in Table 1 (P < 0.05). As an example, Figure 3E emphasizes the higher fMRI responses in the right anterior insula and NAc and medial thalamus for TOP than for SM, WM, and SA.
A direct comparison of brain activation caused by TD learning and outcome periods (Fig. 1A) demonstrated that TD learning caused stronger brain activation responses than outcome such that TOP activation was driven by TD learning (Fig. 4). Whereas TD learning engaged the same cortical and subcortical regions that showed TOP activation/deactivation, outcome strongly activated dorsal striatum and deactivated ventral caudate, bilateral pars opercularis (BA 44), and cerebellar vermis (Fig. 4). This suggests differential involvement of brain areas in TD learning from those involved in outcome processing.
Correlation Between RT and TOP Activation
Voxel-wise linear regression analyses revealed a significant correlation between RT and TOP activation responses in the left and right cerebellum (anterior lobe, PFWE < 0.02) and in the right orbital fronto-insular cortex (OFIC; x, y, z = 54, 11, −5 mm; t-score = 3.9; cluster size 86 voxels; PFWE < 0.02; small volume correction within a 10-mm spherical volume; Fig. 5A). These results are consistent with the role of the anterior insula in timekeeping operations such that the stronger the activation in OFIC the faster (i.e., more accurate) the TPs of the targets. The functional ROI analysis confirmed the negative correlation between RT and TOP activation responses in the cerebellum and OFIC (R < −0.37; P < 0.03; Fig. 5A). In addition, linear regression revealed a negative correlation across all tasks and subjects between RT and activation responses in the OFIC and NAc ROIs (R < −0.24; P < 0.005; Fig. 5B), but less so in the cerebellum (R < −0.20; P = 0.02), which suggests variable timekeeping demands across tasks in these regions.
Successful Versus Unsuccessful Prediction
Behavioral and functional responses during successful predictions (hits: 74.1% of the trials) and unsuccessful predictions (errors or misses: 25.9% of the trials) were analyzed for each of the 34 subjects who did not achieve perfect accuracy during TOP task performance, but not for the 2 subjects who had RT < 300 ms for all 20 trials (perfect performance) in the TOP task. Naturally, RT was significantly longer for misses (0.52 ± 0.11 s; mean ± SD) than for hits (0.17 ± 0.07 s; P = 5 × 10⁻¹⁴, paired t-test; Fig. 6A).
Voxel-wise within-subject ANOVA showed that, in NAc and medial OFC, brain activation responses to the outcome were significantly lower for miss trials than for hit trials (PFWE < 0.0005; cluster size = 800 voxels; Fig. 6B), a finding that was confirmed by the ROI analysis (Fig. 6C). Given the role of NAc in reward processing, this suggests that failure outcomes in TP are associated with negative DAergic reward signaling, or alternatively, that successful outcomes are by themselves rewarding. Within-subject ANOVA also showed that activation responses to TD learning in a medial midbrain region that includes VTA were significantly lower for hits than for misses (PFWE < 0.03; x, y, z = 0, −13, −14 mm; cluster size 90 voxels; peak level t-score = 3.4; small volume corrections within a 10-mm spherical volume; Fig. 6B), which was consistent with our hypothesis on the activation of midbrain DA neurons during errors. Conversely, brain activation responses to TD learning in the cerebellum (vermis) were significantly higher for hits than for misses (PFWE < 0.0005; cluster size = 476 voxels; Fig. 6B). The analysis of functional ROI measures confirmed the findings for cerebellum and VTA (P < 0.02; Fig. 6D).
Linear regression analysis revealed a significant negative correlation between RT and VTA responses to misses, but not to hits during TD learning (PFWE = 0.02; x, y, z = 0, −13, −14 mm; cluster size 38 voxels; peak level t-score = 3.6; small volume corrections within a 10-mm spherical volume), which was confirmed with ROI measures from VTA (R = −0.46, P < 0.006; Fig. 7) and is also consistent with activation of midbrain DA neurons during errors. Furthermore, direct comparison of linear regressions for misses and hits revealed a success × RT interaction effect on VTA responses to TD learning (P = 0.036, slope; P < 0.0005, intercept) such that increased VTA activation was associated with faster responses for unsuccessful (miss) trials (no association between RT and VTA responses was found for hits). Similar correlations for the cerebellum did not reach significance (Fig. 7).
The timekeeping operations required for precise TP of event onset engaged a reward-motivation network (Fig. 4D) that comprised regions modulated by the DA mesolimbic and mesocortical pathways including anterior insula, NAc, ventral caudate, and medial thalamus (Wise 2002; Breiter and Gasic 2004; Kelley et al. 2005; Liu et al. 2011; Haight and Flagel 2014). These observations suggest that accurate TOP performance is experienced by subjects as rewarding. Specifically, the novel TOP task caused stronger activation than the SM, WM, and SA tasks in regions involved in salience attribution (OFIC and dorsal anterior cingulum; Seeley et al. 2007) and DAergic nuclei in the basal ganglia (ventral caudate, NAc, and midbrain) and middle thalamus. These findings are consistent with the modulatory role of DA in the feedback control of interval timing (Lustig and Meck 2005; Meck 2005) and with the effects of DAergic drugs on the speed of the internal clock (Lake and Meck 2013).
RT, the time difference between target onsets and button press responses, was significantly shorter for the TOP task than for the SM, WM, and SA tasks (Fig. 2). Previous studies have shown that temporal processing skills include time perception in the form of verbal time estimation and motor timing, the ability to temporally adjust a motor response to sensory stimulation (Rao et al. 2001). Subjects' responses were 2.5 times faster for the TOP task compared with the perceptual motor task (SM), which demonstrates that subjects anticipated their responses in a way that the identification and interpretation of visual information necessary for the perception of the target was incomplete at the time of the response. Thus, processing speed was not based on perception and the speed of motor responses during the TOP task. Whereas RT could also depend on attention and WM skills (Livesey et al. 2007), the responses in the present study were significantly faster for the TOP task than for the WM and SA tasks. This suggests that subjects' prediction abilities were not supported solely by attention and WM processing based on the spatial information embedded in the cues. We propose that, in the present study, TP was based on incremental learning from the temporal information of events in the TD learning module. However, since there were no behavioral changes over the course of the tasks (habituation/sensitization effects or reinforcement learning), further work is needed to test the hypothesis that TP is based on incremental learning. It is also noteworthy that, although responses were significantly faster, accuracy was significantly higher for the TOP task than for the SM task (Fig. 2), which underscores the value of prediction for improving performance.
The engagement of the anterior insula and the basal ganglia during TP is consistent with previous studies on time perception and supports an important role of these brain regions in timekeeping. Previous studies have shown activation in middle and posterior insula during the reproduction of time durations of auditory and visual stimuli, consistent with the role of the insula in time encoding (Livesey et al. 2007; Wittmann et al. 2010; Bueti and Macaluso 2011). Recent studies on attention to interoceptive processes showed that fear caused subjective time dilation, suggesting that experience of time emerges from emotional and interoceptive signals in the insular cortex (Pollatos et al. 2014). Here, we show a correlation between activation responses in the anterior insula and RT during the TOP task (Fig. 5). This finding is also consistent with the accumulation function of the anterior insula, which postulates that our sense of time reflects the accumulation of physiological changes in body states (Wittmann et al. 2010; Bueti and Macaluso 2011). The anterior insula is also involved in TD learning (Sutton and Barto 1981; Sutton 1988) during Pavlovian conditioning. The TOP task contains a TD learning module (Fig. 1B), which allows reinforcement of the prediction circuitry to accurately forecast target onsets. Thus, the activation of the anterior insula during the TOP task, but less so during SM, WM, and SA, suggests an important role of reward learning on event onset forecasting during the TD learning module. The correlation between activation responses in the anterior insula and RT across tasks and subjects (Fig. 5B) suggests differential timekeeping reinforcement for SM, SA, WM, and TOP that increase in proportion to the predictability of the events.
The basal ganglia have been implicated in WM and some attention processes (Schroll et al. 2012). This study is the first to test the involvement of the basal ganglia in TP while controlling for other concurrently active mental processes (WM, SA, and SM). The stronger activation of the ventral striatum (caudate and NAc) for the TOP task than for the SM, WM, and SA tasks is consistent with the involvement of the basal ganglia in timing and reward learning (Rao et al. 2001; Schultz 2007). Converging evidence from psychopharmacological studies as well as studies in animals, healthy volunteers, and patients with Parkinson's disease, schizophrenia, or attention deficit disorder implicates basal ganglia pathways in timekeeping (Maricq and Church 1983; Jahanshahi et al. 2006; Jones et al. 2008; Carroll et al. 2009; Rubia et al. 2009; Allman and Meck 2012; Coull et al. 2012; Noreika et al. 2013). The NAc, an important nucleus for reward and for prediction of reward-related activity, is involved in procedural learning and in reward-learned associations (Saddoris and Carelli 2014). Indeed, studies on Pavlovian conditioning and decision-making implicate the ventral striatum in reward learning (O'Doherty et al. 2003; Rolls et al. 2008). Activation in the ventral striatum also showed a significant correlation with RT across tasks and subjects (Fig. 5B), supporting differential reinforcement for SM, SA, WM, and TOP.
Midbrain neurons process rewarding and reward-predicting stimuli, and electrophysiological data from alert monkeys (Mirenowicz and Schultz 1994) showed that the firing rate of midbrain neurons is associated with the error function in the TD learning algorithm (Schultz et al. 1997). In the present study, compared with successful predictions, prediction errors caused decreased NAc activation to outcome and increased midbrain activation to TD learning (Fig. 6C), such that greater VTA activation was associated with less inaccurate predictions (i.e., shorter RT; Fig. 7). This is consistent with the role of the NAc in reward processing and with the increased involvement of the midbrain in learning (D'Ardenne et al. 2008; Glimcher 2011; Iglesias et al. 2013), and suggests that VTA is increasingly activated during error trials to improve the accuracy of the predictions. Here, we propose reward learning as a potential mechanism behind the increased activation of VTA to errors compared with successful predictions (Fig. 1C). Previous studies have shown that midbrain neurons fire in proportion to the TD error function, δ (Schultz et al. 1997), which in this study reflects the time difference between sensory information corresponding to the onsets of the circles at t1, t2, and t3 (V; correct prediction) and the time predictions of these onsets (W). We assume that if the TD learning module is sufficiently prolonged, DAergic reinforcement of accurate predictions should continue until the time differences between predictions and events are null (successful prediction or hit). For a time-limited TD learning module, such as the one in this work, successful prediction can be achieved (hit) or not (miss) during each trial. Since the fMRI response is proportional to the time integral of the firing rate (Logothetis et al. 2001; Sander et al. 2013), activation responses of DA neurons in VTA should be higher for miss than for hit trials, as shown in our data (Fig. 6D).
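The reasoning above can be illustrated with a minimal numerical sketch. It is not the analysis used in this study: it assumes a simple delta-rule update in which the prediction W moves toward the observed timing V by a fraction of the error δ = V − W at each of the three cue onsets, and it takes the summed absolute error as a crude stand-in for the time integral of the firing rate. The function name, the learning rate, the hit tolerance, and the example timings are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def td_timing_module(
    observed: List[float],   # V: observed timing at each cue (t1, t2, t3), in seconds
    w0: float,               # initial prediction W before the module starts
    alpha: float = 0.5,      # hypothetical learning rate
    tol: float = 0.05,       # hypothetical tolerance (s) defining a "hit"
) -> Tuple[float, float, bool]:
    """Delta-rule sketch of a time-limited TD learning module.

    Returns the final prediction, the summed absolute error over the
    module (proxy for integrated error-related firing), and whether
    the final prediction falls within the hit tolerance.
    """
    w = w0
    integrated_error = 0.0
    for v in observed:
        delta = v - w                   # TD error: observed minus predicted timing
        integrated_error += abs(delta)  # accumulate error magnitude across cues
        w += alpha * delta              # move the prediction toward the observation
    hit = abs(observed[-1] - w) <= tol  # residual error after the last update
    return w, integrated_error, hit


# Two hypothetical trials with the same 1.0 s cue timing but different
# starting predictions: one close to the true value, one far from it.
cues = [1.0, 1.0, 1.0]
w_good, err_good, hit_good = td_timing_module(cues, w0=0.9)
w_bad, err_bad, hit_bad = td_timing_module(cues, w0=0.2)
print(hit_good, round(err_good, 3))
print(hit_bad, round(err_bad, 3))
```

Under these assumptions, the trial that starts far from the true timing does not converge within three cues (a miss) and accumulates a larger summed error than the trial that converges (a hit), which is the qualitative pattern the argument predicts for VTA activation.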
Accurate Prediction and Reward
The last decade has seen a proliferation of video games that can be highly rewarding and can result in compulsive gaming. Video gaming reward requires accurate performance, which in many instances requires predicting the onsets of visual stimuli in order to respond prior to target detection. Our simple task emulates some of these properties and shows that TP activates the mesolimbic pathway, which is consistent with previous studies that associated video gaming performance with striatal volume (Erickson et al. 2010), activation of the ventral striatum (Hoeft et al. 2008; Kühn et al. 2011), and striatal DA release (Koepp et al. 1998). In addition, the higher activation to the outcome during accurate versus inaccurate performance highlights the rewarding nature of accurate performance in and of itself, which further reinforces behaviors in video gaming. Inasmuch as compulsive gaming behavior emerges in some individuals who play these video games, the performance-dependent reward and concomitant activation of the DA pathways could underlie the addiction patterns reported with video gaming.
Here, we show a role of the DA meso-accumbens pathway and its connection to insula in time prediction. Our findings on NAc activation in the absence of explicit reward stimuli also provide evidence that accurate task performance in and of itself is rewarding, which could underlie the motivation triggered by challenging activities.
This research was supported by the National Institutes of Health's Intramural Research Program (National Institute on Alcohol Abuse and Alcoholism; 2RO1AA09481) and was carried out using the infrastructure of Brookhaven National Laboratory under contract DE-AC02-98CH10886.
We thank Millard Jayne, Frank Telang, Pauline Carter, and Barbara Hubbard for subject care and protocol oversight; Karen Apelskog for protocol coordination; Ruiliang Wang for MRI data acquisition; and the reviewers for constructive criticism and meaningful suggestions. We also thank the subjects who volunteered to participate in this study. Conflict of Interest: None declared.