It is well known that formation of new episodic memories depends on hippocampus, but in real-life settings (e.g., conversation), hippocampal amnesics can utilize information from several minutes earlier. What neural systems outside hippocampus might support this minutes-long retention? In this study, subjects viewed an audiovisual movie continuously for 25 min; another group viewed the movie in 2 parts separated by a 1-day delay. Understanding Part 2 depended on retrieving information from Part 1, and thus hippocampus was required in the day-delay condition. But is hippocampus equally recruited to access the same information from minutes earlier? We show that accessing memories from a few minutes prior elicited less interaction between hippocampus and default mode network (DMN) cortical regions than accessing day-old memories of identical events, suggesting that recent information was available with less reliance on hippocampal retrieval. Moreover, the 2 groups evinced reliable but distinct DMN activity timecourses, reflecting differences in information carried in these regions when Part 1 was recent versus distant. The timecourses converged after 4 min, suggesting a time frame over which the continuous-viewing group may have relied less on hippocampal retrieval. We propose that cortical default mode regions can intrinsically retain real-life episodic information for several minutes.
Hippocampal damage dramatically impacts episodic memory (Scoville and Milner 1957; Milner et al. 1968), but estimates of how long new information can be maintained without hippocampal involvement, and under what conditions, are mixed (Ranganath and Blumenfeld 2005). Modern experiments rely heavily on isolated items with arbitrary relationships in order to elicit hippocampal activity (Cohen et al. 1999; Brown and Aggleton 2001; Davachi et al. 2003; Giovanello et al. 2004; Köhler et al. 2005); under these conditions, the hippocampus is recruited for information retrieval at delays as short as a few seconds (Hannula et al. 2006, 2007; Yee et al. 2014). This observation is consistent with findings from the neuropsychological literature, where hippocampal amnesic patients may show impairments in retaining random words or pictures over delays as short as a few seconds (Aggleton et al. 1992; Holdstock et al. 1995). In contrast, under richer, more natural conditions, information can be sustained for a few minutes without reliance on the hippocampus. For example, hippocampal amnesics can retain stimulus information long enough to carry on a coherent conversation (Scoville and Milner 1957; Milner et al. 1968), successfully recall prose (Wilson and Baddeley 1988; Baddeley and Wilson 2002), and engage in complex communicative games (Duff et al. 2005). Such observations suggest that information embedded within a meaningful continuous context (e.g., a conversation or story) can persist over time without relying on the hippocampus. This raises the question: what neural systems outside of the hippocampus might support the retention of information for many minutes?
Certain cortical areas display long processing timescales during meaningful auditory and visual narratives (Hasson et al. 2008; Lerner et al. 2011; Honey, Thesen, et al. 2012), that is, their moment-to-moment responses are systematically influenced by information from minutes earlier. The cerebral cortex exhibits a hierarchical structure of processing timescales: At the lowest level, areas with “short timescales” (mainly in sensory cortices) have sensitive periods shorter than a second, whereas at the highest level, areas with “long timescales” show the influence of prior stimulus information over minutes. The areas with long timescales overlap broadly with the “default mode network” (DMN) (Raichle et al. 2001; Hasson et al. 2010), a set of anatomically interconnected cortical regions including posterior medial cortex, medial prefrontal cortex, middle temporal gyrus, and angular gyrus. Numerous studies implicate these areas in long-term memory (LTM) encoding and retrieval (Kim 2010; Rugg and Vilberg 2013); they are also functionally and anatomically connected to the hippocampus (Kahn et al. 2008; Aggleton 2012). The long-timescale properties of these cortical regions make them prime candidates to support the retention of information across minutes, perhaps in concert with the hippocampus and perhaps to some degree independently of the hippocampus.
To investigate the roles of long-timescale cortical regions and the hippocampus in processing stimulus information across minutes of time, we designed an experiment to manipulate the remoteness of memories needed during a continuous narrative. We showed subjects a two-part audiovisual movie during fMRI scanning (Fig. 1a). Access to information from the first half of the movie was manipulated in the following way: One group viewed both parts of the movie without breaks, another group watched the second part of the movie without ever watching the first part, and a third group viewed the first part of the movie 1 day earlier than the second part. Critically, the second part of the movie was identical for all subjects, and we analyzed only the neural responses to this part of the movie. This design enabled us to explore how neural responses to the same naturalistic information stream (the second part of the movie) changed when prior relevant information (the first part of the movie) was seen either a few minutes ago, never, or 1 day prior.
In this paper, we begin by demonstrating that the second part of the movie elicits reliable response timecourses across subjects in cortical areas with long timescales. Then, we show that the response reliability in long-timescale regions is associated with subsequently measured movie comprehension and that it is dependent on memories of the first part of the movie. Finally, we ask to what degree this influence of the past depends on interactions between long-timescale cortical regions and the hippocampus when memories are from a) a few minutes ago, versus b) 1 day ago. Retrieval of events from 1 day prior must rely heavily on the hippocampus. When the same events occurred just a few minutes ago, is the hippocampus equally recruited? We show that, during the second part of the movie, access to memories from the recent past (minutes ago) elicits less hippocampal-cortical interaction than access to the same events from the more distant past (1 day ago). Furthermore, long-timescale cortical dynamics persistently differ for a few minutes before converging, suggesting that these regions carry different information when memories come from a few minutes ago versus 1 day ago. We propose that, in the context of a continuous narrative, long-timescale regions are able to intrinsically retain some information from the recent past across minutes.
Materials and Methods
The audiovisual movie used for the main experiment was an episode of “The Twilight Zone” entitled “The Lateness of the Hour,” 1960 (black-and-white, 25 min long). The first 15 min are referred to as “Part 1,” and the remaining 10 min are referred to as “Part 2.” This division was chosen to coincide with a scene break. The movie was specifically selected such that Part 2 would be difficult to understand without having seen Part 1. For example, in Part 1 of the movie, the viewer learns that several of the characters are robots that appear to be human, but after Part 1, this fact is not mentioned again until Minute 5 of Part 2; viewers who never saw Part 1 would have difficulty understanding the motives of these characters during Part 2.
Sixty volunteers (34 female), all native English speakers with normal or corrected-to-normal vision, participated. Of these, 9 subjects were excluded: 3 for motion in excess of 3 mm, 2 for falling asleep during the scan, 2 for signal quality problems, and 2 for discomfort during the scan. Informed consent was obtained according to procedures approved by the Princeton University Internal Review Board for all subjects reported in this paper.
Subjects in the recent memory (RM) condition (N = 24) watched the entire 25-min movie (Part 1 and Part 2) continuously in a single scan. Subjects in the distant memory (DM) condition (N = 14) first attended a behavioral session during which they watched Part 1, after which they were instructed not to talk or read about the movie until the next session; the next day they watched Part 2 in a single scan. Subjects in the no memory (NM) condition (N = 13) watched Part 2 in a single scan without ever having seen Part 1 (Fig. 1a). After the movie, all subjects (with the exception of 2 RM subjects) listened to an auditory story (“Pie-man,” 7 min, see Lerner et al. 2011), and some subjects (18 in RM, 10 in DM, 12 in NM) additionally listened to a scrambled version of the same story as part of a separate experiment (data not reported here). All subjects were administered a memory test outside of the scanner after the session. Age and sex distributions for the groups were as follows: RM, mean 23.0, SD 3.5, range 19–31, 13 of 24 female; DM, mean 22.7, SD 5.0, range 18–33, 9 of 14 female; NM, mean 21.0, SD 4.2, range 18–31, 6 of 13 female.
During scanning, the movie was presented using an LCD projector onto a rear-projection screen located in the magnet bore and viewed with an angled mirror. The Psychophysics Toolbox [http://psychtoolbox.org] in MATLAB was used to display the images and synchronize the movie onset with MRI data acquisition. Audio for the movie was delivered via in-ear headphones. Eyetracking was conducted using the iView X MRI-LR system (SMI Sensomotoric Instruments). No behavioral responses were required from the subjects during scanning, but the eyetracking camera allowed the experimenter to monitor the subjects' alertness. Any subjects who appeared to fall asleep, as assessed by video monitoring, were excluded from further analyses.
Postscan Memory Test
The postscan questionnaire (Fig. 1b) was constructed from a set of 25 free-response written questions (Supplementary Table 1) that were designed to probe specific times/events in the movie (e.g., “What did the daughter drop on the floor and break?”), as well as 6 general questions not tied to a single event (e.g., “What was the father's name?”). Of the 13 questions that probed events from Part 2 of the movie, 6 covered Minutes 1–3, 5 covered Minutes 4–6, and 2 covered Minutes 7–9. Each test question was scored by a reader blind to condition using a 1–3 scale: 1 = no answer or answer completely incorrect; 2 = incorrect but semantically similar (e.g., the correct answer is “normal people,” subject responded “regular people”); 3 = correct. While the goal of this test was to probe story comprehension/memory for specific times in the movie, questions about early events in the movie might be answered based on information gathered later on, severely restricting the types of questions that could be asked. For example, NM subjects likely did not know the characters' motives in the first minute of Part 2 (having no prior experience with the story), but they were likely able to figure them out from later events. To overcome the inherent limitations of the postscan memory test, we ran an additional “stop-and-ask” experiment (Fig. 1c).
A separate behavioral experiment was conducted to probe memory-based comprehension during the course of Part 2 of the movie (Fig. 1c). The same audiovisual movie was used as in the main experiment. Twenty-nine volunteers, all native English speakers with normal or corrected-to-normal vision, participated.
The 3 experimental conditions were defined exactly as in the main experiment (RM, n = 12; DM, n = 7; NM, n = 10). Subjects viewed the movie on a computer monitor and listened through headphones. The movie was presented using the Psychophysics Toolbox in MATLAB. The subset of subjects who viewed Part 1 (the RM and DM groups) viewed it without interruptions. During Part 2, the video was occasionally paused in order to display a comprehension question. At each pause, the screen went blank and a text question was displayed, along with 4 possible answers. Subjects had as long as they wished to read the questions and answers and to select an answer using the keyboard. Upon selection, the movie resumed exactly where it had left off. On average, these questions occurred 4.7 times per minute throughout the movie. There were 49 questions total, 4 of which were catch trials distributed randomly throughout Part 2 to ensure that subjects were attending to the task (Supplementary Table 2). The questions were designed to probe comprehension of the narrative specifically at the time of the question, as opposed to more general postviewing comprehension of the movie. To this end, each question asked the subject what was likely to happen next in the movie, that is, immediately after the video resumed. For many of the questions, answering correctly required knowledge of events from Part 1 of the movie. For example, in the question at 138.6 s, “What is the man going to say next?,” the choices are “George,” “Mr. Fowler,” “Robert,” and “Dr. Loren.” Subjects who remembered events from Part 1, and were thus already familiar with the characters, should be able to identify the person about to be addressed as “Robert,” whereas subjects without memories of Part 1 should not.
MRI data were collected on a 3T full-body scanner (Siemens Skyra) with a 16-channel head coil. Functional images were acquired using a T2*-weighted echo planar imaging pulse sequence (TR 1500 ms, TE 28 ms, flip angle 64°, whole-brain coverage with 27 slices of 4-mm thickness acquired in ascending interleaved order, in-plane resolution 3 × 3 mm², FOV 192 × 192 mm²). Anatomical images were acquired using a T1-weighted MPRAGE pulse sequence (0.89-mm³ resolution). Anatomical images were acquired in an 8-min scan prior to the functional scan, during which time subjects watched a nature documentary (BBC's “Life”).
Preprocessing was performed in FSL (http://fsl.fmrib.ox.ac.uk/fsl), including slice time correction, motion correction, linear detrending, high-pass filtering (140 s cutoff), and coregistration and affine transformation of the functional volumes to a template brain (MNI). Functional images were resampled to 3-mm isotropic voxels for further analyses. To ensure that comparisons across groups reflected conditions where all subjects were exposed to identical stimuli, 3 volumes were dropped from the beginning of Part 2 (4.5 s of the movie at a TR of 1.5 s) for all subsequent analyses. This step eliminated volumes that might, due to hemodynamic response delay, correspond to the last few seconds of Part 1 in the RM group.
Intersubject correlation (ISC) and statistical analyses were performed using in-house software written in MATLAB (Mathworks). ISC is the correlation of BOLD activity timecourses across subjects viewing/listening to a common visual/auditory stimulus (Hasson et al. 2004; Hasson et al. 2010). When multiple subjects are exposed to the same continuous stimulus (e.g., a movie), similar response timecourses are observed across subjects in brain regions that process the information contained in the stimulus. This response reliability is absent in brain regions for which the stimulus is irrelevant; for example, low ISC is observed in visual cortex when the stimulus is purely auditory. Ongoing activity can also be compared between groups exposed to differing conditions, in order to evaluate the effect of those condition differences on response dynamics over time.
ISC within a group (“reliability”) is calculated as an average correlation $R = \frac{1}{N}\sum_{j=1}^{N} r_j$ at each voxel, where $N$ is the number of subjects in the group and each $r_j$ is the Pearson correlation between that voxel's BOLD timecourse in 1 individual subject and the average of that voxel's BOLD timecourses in the remaining individuals in the group.
ISC between groups is calculated as an average $R = \frac{1}{N}\sum_{j=1}^{N} r_j$ at each voxel, where $N$ is the number of individuals in the first group and each $r_j$ is the Pearson correlation between that voxel's BOLD timecourse in the jth individual from the first group and the average of BOLD timecourses of all individuals in the other group. Unlike traditional GLM analysis, the ISC method does not assume a prototypical response profile for specific stimulus events. Instead, brain responses from one subject are used as a model to predict brain responses to the same content in another subject, at any given voxel.
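The two ISC definitions above can be illustrated for a single voxel with a minimal sketch. This is not the in-house MATLAB software used in the study; the function names (`pearson`, `mean_timecourse`, `isc_within`, `isc_between`) and the toy data are hypothetical, and the code assumes every timecourse has the same length and nonzero variance.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither input is constant

def mean_timecourse(timecourses):
    """Average several timecourses time point by time point."""
    return [sum(vals) / len(vals) for vals in zip(*timecourses)]

def isc_within(group):
    """Within-group ISC at one voxel: the mean over subjects j of r_j, where
    r_j correlates subject j with the average of the remaining subjects."""
    rs = []
    for j, tc in enumerate(group):
        others = group[:j] + group[j + 1:]
        rs.append(pearson(tc, mean_timecourse(others)))
    return sum(rs) / len(rs)

def isc_between(group_a, group_b):
    """Between-group ISC at one voxel: the mean over subjects j in group_a of
    the correlation between subject j and the average of all of group_b."""
    avg_b = mean_timecourse(group_b)
    rs = [pearson(tc, avg_b) for tc in group_a]
    return sum(rs) / len(rs)

# Toy data (hypothetical): three subjects, four time points each.
group_a = [[1.0, 2.0, 3.0, 5.0], [2.0, 4.0, 6.0, 10.0], [0.0, 1.0, 2.0, 4.0]]
group_b = [[-1.0, -2.0, -3.0, -5.0], [4.0, 3.0, 2.0, 0.0]]
within = isc_within(group_a)
between = isc_between(group_a, group_b)
```

In a whole-brain analysis, the same computation would be repeated independently at every voxel to produce an ISC map.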
All ISC maps were calculated in volume space. Projections onto a cortical surface for visualization were performed with NeuroElf (http://neuroelf.net).
All t-tests reported in the paper are two-tailed unless otherwise indicated.
A timescale localizer scan was used to delineate short-, medium-, and long-timescale ROIs (Fig. 2a) following procedures established in previous studies of hierarchical timescales in the cortex (Hasson et al. 2008; Lerner et al. 2011; Honey, Thesen, et al. 2012), in a separate group of subjects. The audiovisual movie used for the localizer was a 325-s clip from the 1975 commercial film Dog Day Afternoon (Lumet 1975); this movie was presented either intact, coarsely scrambled (7.1–22.3-s segments), or finely scrambled (0.5–1.6-s segments) in time. ISC for each voxel in each condition was assessed, and a voxel was defined as 1) short timescale if it had above-threshold ISC in all 3 conditions, 2) medium timescale if it had above-threshold ISC in the intact and coarse scramble conditions but not in the fine scramble condition, and 3) long timescale if it had above-threshold ISC for only the intact condition. For more details about the timescale localizer, see Supplementary Material.
In addition to the categorical assignments, each voxel was assigned a “timescale index” by subtracting ISC in the coarse scramble condition from ISC in the intact condition. The timescale index is a continuous variant of the categorical timescale assignments. We used intact-coarse rather than intact-fine in order to have better sensitivity in the medium-to-long timescale range.
In addition to the timescale ROIs, an anatomical hippocampus ROI was defined based on the probabilistic Harvard-Oxford Subcortical Structural atlas (Desikan et al. 2006), and 2 ROIs (posterior cingulate cortex [PCC] and medial prefrontal cortex [MPFC]) were taken from an atlas defined from resting-state connectivity (Shirer et al. 2012).
The DMN (Fig. 2b) was mapped by calculating functional connectivity (within-subject correlation) between the PCC ROI and every other voxel in the brain for each subject separately using data from the Intact condition of the timescale localizer (the average of 2 Intact runs). Brain maps were averaged across subjects and thresholded at R > 0.4. This map was used only for the visualization of the DMN and did not enter any other analyses.
Voxel-Level ISC Comparisons between Conditions
The RM group was randomly split into 2 groups (N = 12 each) for internal replication analyses. ISC maps were created for each of the 4 groups (DM, NM, RM1, and RM2) for each of the 5 nonoverlapping 2-min windows spanning Part 2 of the movie. The statistical likelihood of each observed correlation was assessed using a bootstrapping procedure based on phase-randomization, and maps were corrected for multiple comparisons using nonparametric family-wise error rate, as described in Regev et al. (2013). The number of voxels above threshold in each group in each timescale ROI in each 2-min time window was computed and submitted to a Group × Time ANOVA. If a significant Group × Time interaction was found, indicating that effects differed across time windows, one-way ANOVA was performed within each time window. If a significant effect of Group was found in this ANOVA, post hoc t-tests were performed to determine how the groups differed from each other (Supplementary Fig. 1). Voxels that were above threshold during the first 2-min window of Part 2 within the RM and DM groups, but not in the NM group, are displayed in Figure 3a. Within each timescale ROI, the number of above-threshold voxels was computed for a range of thresholds (R = 0.01, 0.05, 0.10, 0.15) for each condition (Fig. 3b).
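The threshold sweep described above (counting above-threshold voxels at R = 0.01, 0.05, 0.10, 0.15) can be sketched as follows. The function name `count_above_thresholds` and the example ISC values are hypothetical and only illustrate the bookkeeping, not the actual analysis code.

```python
def count_above_thresholds(isc_by_voxel, thresholds=(0.01, 0.05, 0.10, 0.15)):
    """Map each threshold to the number of voxels whose ISC exceeds it."""
    return {t: sum(1 for r in isc_by_voxel if r > t) for t in thresholds}

# Hypothetical ISC values for five voxels in one ROI and one time window.
roi_isc = [0.0, 0.02, 0.07, 0.12, 0.2]
counts = count_above_thresholds(roi_isc)
```

Running this once per group, ROI, and time window would give the voxel counts that enter the Group × Time ANOVA.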
ISC was calculated within hippocampal voxels for each condition, and the number of above-threshold voxels was computed for a range of thresholds (R = 0.01, 0.05, 0.10, 0.15; Fig. 4a,b). The number of voxels above threshold in each group in the hippocampus ROI in each 2-min time window was computed and submitted to a Group × Time ANOVA. ISC was also calculated at the ROI level by averaging across all voxels in the ROI and then calculating correlations across subjects within-group, for both hippocampus and for the long-timescale ROI (Fig. 4c,d). To compare hippocampal ISC with ISC in long-timescale regions, we performed a Group (NM/RM1/RM2/DM) × Time (five 2-min windows) × ROI (hippocampus/long-timescale ROI) ANOVA.
Correlations between hippocampus and other brain regions were calculated using intersubject functional correlation (ISFC). ISFC differs from standard functional connectivity analysis in that correlations are calculated across brain regions across subjects, rather than within subject. This technique isolates stimulus-locked activity from background correlations and has been shown to differentiate between rest, intact story, and scrambled story conditions when within-subject correlation analyses cannot (Simony et al. 2012). Hippocampal ISFC was calculated within each group by first averaging timecourses for all voxels within the hippocampus ROI separately for every individual, then calculating the correlation of each individual's hippocampal timecourse with the average timecourse of all others in the group at each nonhippocampal voxel, then averaging across individuals at each nonhippocampal voxel (Fig. 5a).
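As an illustration of the hippocampal ISFC calculation at a single nonhippocampal voxel, the sketch below follows the steps in the text: each individual's ROI-averaged hippocampal timecourse is correlated with the average voxel timecourse of all other individuals, and the results are averaged across individuals. The names `hippocampal_isfc`, `hipp`, and `voxel` are hypothetical, and the code assumes equal-length timecourses with nonzero variance.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither input is constant

def mean_timecourse(timecourses):
    """Average several timecourses time point by time point."""
    return [sum(vals) / len(vals) for vals in zip(*timecourses)]

def hippocampal_isfc(hipp, voxel):
    """ISFC between hippocampus and one nonhippocampal voxel.

    hipp[j]  : subject j's timecourse averaged over the hippocampus ROI
    voxel[j] : subject j's timecourse at the nonhippocampal voxel
    Returns the mean over j of corr(hipp[j], average of voxel[k] for k != j).
    """
    rs = []
    for j in range(len(hipp)):
        others = voxel[:j] + voxel[j + 1:]
        rs.append(pearson(hipp[j], mean_timecourse(others)))
    return sum(rs) / len(rs)

# Toy data (hypothetical): three subjects, four time points each.
hipp = [[1.0, 2.0, 3.0, 5.0], [3.0, 5.0, 7.0, 11.0], [0.0, 1.0, 2.0, 4.0]]
voxel = [[2.0, 4.0, 6.0, 10.0], [1.0, 2.0, 3.0, 5.0], [3.0, 6.0, 9.0, 15.0]]
isfc_value = hippocampal_isfc(hipp, voxel)
```

Because the hippocampal seed and the target voxel never come from the same brain, correlations that arise from subject-specific noise or spontaneous activity do not contribute, which is the property that distinguishes ISFC from within-subject functional connectivity.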
The statistical likelihood of each observed correlation was assessed using a bootstrapping procedure based on phase randomization. The null hypothesis was that the hippocampal timecourse in each individual was independent of the timecourse in every voxel outside the hippocampus in any other individual. Each nonhippocampal voxel timecourse was phase randomized by applying a fast Fourier transform to the timecourse, randomizing the phase of each Fourier component, and inverting the Fourier transformation. This procedure scrambles the phase of the timecourse but preserves its power spectrum. For each randomly phase-scrambled surrogate data set, we computed the hippocampal ISFC for all nonhippocampal voxels in the exact same manner as the empirical correlation maps described earlier, that is, by calculating the correlation between each individual's hippocampal timecourse and the average timecourse of all others in the group at each nonhippocampal voxel. The resulting correlation values were averaged within each voxel across all subjects, creating a null distribution of average correlation values for all voxels.
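The phase-randomization step can be sketched with a small pure-Python discrete Fourier transform. This is an illustrative version under stated assumptions, not the study's code: the names `dft`, `idft`, and `phase_randomize` are hypothetical, the naive transform is only practical for short timecourses, and the sketch makes the additional assumption that phases are randomized in conjugate-symmetric pairs (leaving the mean and, for even lengths, the Nyquist component untouched) so that the surrogate is real-valued.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def phase_randomize(x, rng):
    """Return a surrogate timecourse with the same power spectrum as x
    but with randomized Fourier phases."""
    n = len(x)
    original = dft(x)
    surrogate = list(original)
    # Randomize each positive-frequency component and mirror it onto the
    # matching negative frequency (conjugate symmetry keeps the output real).
    for k in range(1, (n - 1) // 2 + 1):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        surrogate[k] = abs(original[k]) * cmath.exp(1j * phase)
        surrogate[n - k] = surrogate[k].conjugate()
    # Discard the negligible imaginary residue left by rounding error.
    return [v.real for v in idft(surrogate)]

# Hypothetical 8-sample timecourse and a seeded generator for repeatability.
timecourse = [0.0, 1.0, 0.5, -0.3, 2.0, 1.2, -1.0, 0.4]
scrambled = phase_randomize(timecourse, random.Random(0))
```

Each surrogate keeps the amplitude at every frequency, so its temporal autocorrelation matches the original while its alignment to the stimulus is destroyed.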
To correct for multiple comparisons, we selected the highest value from the null distribution of all voxels in a given iteration. We repeated this bootstrap procedure 1000 times to obtain a null distribution of the maximum noise correlation values. We controlled the family-wise error rate at alpha = 0.05 by setting a threshold (R*) such that R* was only exceeded by the top 5% of the null distribution of maximum correlation values; this R* value was used to threshold the empirical map (Nichols and Holmes 2002). In other words, in the hippocampal ISFC map, only voxels with a mean correlation value (R) above the threshold derived from the bootstrapping procedure (R*) were considered significant after correction for multiple comparisons and were presented on the final map. ISFC between hippocampus and timescale ROIs (see the section Timescale Localizer) was assessed by first calculating hippocampal ISFC for all voxels, then averaging across voxels within the desired ROI. The timescale ROIs did not overlap with any hippocampal voxels.
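The maximum-statistic correction can be sketched as follows, assuming the surrogate ISFC maps have already been computed. The function names (`max_statistic_null`, `fwe_threshold`, `significant_voxels`) and the synthetic null maps are hypothetical, and the quantile convention (take the largest value that leaves at most a fraction alpha of the null maxima above it) is one reasonable choice rather than a detail confirmed by the text.

```python
import math

def max_statistic_null(null_maps):
    """Take the largest correlation across voxels from each surrogate map."""
    return [max(surrogate_map) for surrogate_map in null_maps]

def fwe_threshold(null_maxima, alpha=0.05):
    """Return R* such that only the top alpha fraction of the null
    distribution of maximum values exceeds it."""
    ordered = sorted(null_maxima)
    n_exceed = math.floor(alpha * len(ordered))
    return ordered[len(ordered) - n_exceed - 1]

def significant_voxels(empirical_map, r_star):
    """Indices of voxels whose empirical mean correlation exceeds R*."""
    return [i for i, r in enumerate(empirical_map) if r > r_star]

# Synthetic example (hypothetical): 1000 surrogate maps of 3 voxels each.
null_maps = [[i / 1000, -0.5, 0.0] for i in range(1, 1001)]
null_maxima = max_statistic_null(null_maps)
r_star = fwe_threshold(null_maxima, alpha=0.05)
survivors = significant_voxels([0.10, 0.96, 0.95, 0.99], r_star)
```

Because the threshold is drawn from the distribution of the single largest null value per map, one R* controls the chance of any false positive across all voxels at once.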
The average hippocampus ISFC value across voxels in the long-timescale ROI was computed for each condition (Fig. 5b). At the ROI level, hippocampus ISFC with the PCC and MPFC ROIs was calculated by first averaging timecourses across voxels within-ROI and then computing correlations between ROIs across subjects (Fig. 5c).
Between-Group ISC in Timescale ROIs
The RM group was randomly split into 2 groups, RM1 and RM2 (N = 12 each). As all RM subjects were recorded under identical experimental conditions, the between-group ISC of these groups (RM1∼RM2) estimated the maximum possible ISC between groups in any given time window.
Between-group ISC was calculated in sliding windows over the duration of Part 2, yielding RM1∼RM2, DM∼RM2, and NM∼RM2 values at every voxel. These ISC values were then averaged within-ROI for the short-, medium-, and long-timescale ROIs (Fig. 6). The sliding window was 120 s wide and center-plotted, that is, the first value in the trace corresponds to time window 0–120 s. The window was chosen to be short enough to reveal changes in ISC across the duration of Part 2, but long enough to contain an adequate number of samples for temporal correlation analysis (80 TRs). To determine at what times during Part 2 ISC values significantly differed, we tested windows at 60-s intervals throughout Part 2 (i.e., each tested 120-s window overlapped with its neighbors by 60 s). For each window, a null distribution of ISC differences was created by scrambling the group labels 10,000 times; for example, in the RM1∼RM2 versus DM∼RM2 comparison, each subject's ISC value was randomly assigned to the RM1∼RM2 or DM∼RM2 group, and the difference in ISC was computed across the random groups. Using this null distribution of ISC differences, a P-value was calculated (one-tailed test) for each empirically measured ISC difference. These P-values were then corrected for multiple comparisons by controlling the false discovery rate (FDR, Benjamini and Hochberg 1995) with a q-threshold of 0.05.
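The windowing, label-scrambling test, and FDR correction described above can be sketched as follows. All names (`window_starts`, `permutation_pvalue`, `benjamini_hochberg`) and the example values are hypothetical. The sketch assumes a 1.5-s TR, so a 120-s window spans 80 TRs and a 60-s step spans 40 TRs, and it uses the common (count + 1) / (n_perm + 1) convention for the permutation P-value, which is an assumption rather than a detail given in the text.

```python
import random

def window_starts(n_trs, width, step):
    """Start indices (in TRs) of all full windows of the given width."""
    return list(range(0, n_trs - width + 1, step))

def permutation_pvalue(values_a, values_b, n_perm, rng):
    """One-tailed P-value for mean(values_a) - mean(values_b), obtained by
    randomly reassigning the per-subject ISC values to the two groups."""
    def mean(v):
        return sum(v) / len(v)
    observed = mean(values_a) - mean(values_b)
    pooled = list(values_a) + list(values_b)
    n_a = len(values_a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def benjamini_hochberg(pvals, q=0.05):
    """Return a list of booleans marking which tests are rejected at FDR q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    largest_passing_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            largest_passing_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[i] = True
    return rejected

# Hypothetical example: 400 TRs, 80-TR windows tested every 40 TRs.
starts = window_starts(400, 80, 40)
rng = random.Random(0)
p = permutation_pvalue([0.9, 0.8, 0.85, 0.95], [0.1, 0.2, 0.15, 0.05], 2000, rng)
flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], q=0.05)
```

In practice one P-value would be computed per tested window and the full set passed to the FDR step together.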
Comparison of RM versus DM Map to Timescale Map
Dissimilarity between DM and RM dynamics (DM∼RM2 vs. RM1∼RM2) was computed at every voxel in the brain using a two-tailed t-test (Fig. 7a). A mask was made consisting of every voxel that both 1) responded reliably during the movie and 2) was present in any of the 3 timescale ROIs. Within this mask, timescale index (see the section Timescale Localizer) was calculated for every voxel and plotted against the dissimilarity of DM versus RM (Fig. 7b).
Early Hippocampal-Cortical Coupling versus Later Cortical Activity
Hippocampal-cortical interaction during the “early window” (Minutes 1–4 of Part 2) was measured by calculating ISFC between the hippocampus and all voxels in the long-timescale ROI for each subject in the DM group, then averaging across voxels. In the “late window” (Minutes 5–10 of Part 2), we calculated the similarity (correlation) of each subject's long-timescale ROI timecourse with the average timecourse in the same ROI in the RM groups (either RM1 or RM2). We then calculated the correlation between early-window hippocampal-cortical interaction and late-window cortical similarity. See Supplementary Methods and Results for control analyses.
Three groups of subjects viewed the second part (Part 2, Minutes 16–25) of an audiovisual movie, an episode of “The Twilight Zone” (Smight 1960), during fMRI scanning (Fig. 1a). One group viewed Part 1 (Minutes 1–15) and Part 2 consecutively without breaks (they had “Recent Memory” of Part 1, N = 24), another group watched Part 2 of the movie without ever watching Part 1 (they had “No Memory” of Part 1, N = 13), and a third group had a 1-day break between watching Part 1 and Part 2 of the movie (they had “Distant Memory” of Part 1, N = 14). The “Recent Memory” group was randomly divided into halves (“Recent Memory 1” and “Recent Memory 2,” N = 12 and N = 12) in order to allow unbiased between-group comparisons in certain analyses; replicated results across the 2 independent “Recent Memory” groups are shown throughout.
Two behavioral tests were conducted to assess subjects' comprehension of events at different times during the movie. First, a postscan questionnaire was administered, consisting of 25 free-response questions, 13 of which probed events from Part 2 (see Materials and Methods and Supplementary Table 1). While there was no difference in performance between the RM and DM groups, a significant deficit was found at the beginning of Part 2 for the NM group (F2,47 = 3.85, P < 0.05; post hoc t-tests, P < 0.05; Fig. 1b). No significant performance differences were found for the remainder of the session.
We were especially interested in how memories of Part 1 were called on at different times during Part 2. The postscan test may have underestimated differences between groups, as subjects in the NM condition might have used knowledge gathered toward the end of Part 2 to answer questions about events that happened at the beginning of Part 2. To overcome this limitation, we ran an additional “stop-and-ask” comprehension test with 29 new subjects. In this “stop-and-ask” experiment, subjects viewed Part 2 with pauses (a few times per minute) for multiple-choice comprehension questions. The questions were designed specifically to probe for comprehension of events relying on information from Part 1, for example, comprehension of motives for a character's actions during Part 2 that were explained during Part 1 (see Materials and Methods and Supplementary Table 2). The “stop-and-ask” test revealed significantly reduced memory-based comprehension for NM (n = 10), lasting throughout the movie, with no significant differences between RM (n = 12) and DM (n = 7) at any point (Fig. 1c). While the number of subjects in this analysis was relatively small and thus the results should be treated cautiously, the pattern suggests that, because subjects in the NM group did not see Part 1 of the movie, they had impaired comprehension throughout Part 2 compared with subjects in the other groups.
Areas with short, intermediate, and long processing timescales were defined using an independent localizer based on temporal scrambling of a movie (Fig. 2a; see Materials and Methods). As in previous studies (Hasson et al. 2008; Lerner et al. 2011; Honey, Thesen, et al. 2012), this method revealed a hierarchy of timescales across a large portion of the cortex, with short timescales in early sensory regions gradually transitioning to long timescales in high-order regions. Long-timescale areas overlapped broadly with the “default mode network” (Raichle et al. 2001; Hasson et al. 2010) (Fig. 2b). Long-timescale areas were predicted to exhibit sensitivity to the experimental manipulations of Part 1 of the movie (never seen, seen recently, or seen 1 day ago). Short-timescale areas were expected to be insensitive to the manipulations, and intermediate-timescale areas were expected to fall somewhere between.
Memories of Part 1 Are Needed for Reliable Responses in Long-Timescale Cortical Areas during Part 2
As subjects in the NM condition had impaired comprehension of Part 2 of the movie due to never having seen Part 1, we predicted that this group would also exhibit lower ISC in long-timescale brain regions (Honey, Thompson, et al. 2012; Ames et al. 2014). We calculated ISC in the long-timescale ROI across the duration of Part 2 (5 nonoverlapping 2-min windows). Differences between groups were greatest in the first 2-min window of Part 2 (Group × Time interaction, F12,188 = 2.60, P < 0.005; Supplementary Fig. 1). Specifically, the NM group had lower ISC than the other 3 groups during the first 2-min window (post hoc t-tests, P < 0.005). The same pattern was observed in areas with intermediate processing timescales (P < 0.005; Supplementary Fig. 1B,C) but not in areas with short processing timescales (nonsignificant Group × Time interaction; Supplementary Fig. 1A). Differences between groups were greatest by design at the beginning of Part 2, as the groups had different stimulus histories at that point (recent, distant, or no experience of Part 1); as Part 2 unfolded over time, the same stimulus history of Part 2 accumulated across all groups, and thus differences between groups diminished.
Figure 3a shows all voxels that responded reliably within the RM and DM groups, but not in the NM group, during the first 2-min window of Part 2. Such voxels were found primarily in areas with intermediate and long processing timescales (1043 voxels fell within the medium-timescale ROI and 1292 voxels within the long-timescale ROI), but not in areas with short processing timescales (210 voxels fell within the short-timescale ROI). The results were not dependent on thresholding, as the difference between NM and the other 2 conditions was observed using a range of thresholds (Fig. 3b). For the window 0–120 s (Minutes 1–2) shown in Figure 3a,b, the P < 0.01 thresholds for each condition were as follows: RM1, R* = 0.065; RM2, R* = 0.065; DM, R* = 0.060; NM, R* = 0.062. See Supplementary Figure 2 for replication of ISC maps across the 2 RM groups. Importantly, both RM and DM groups exhibited high ISC in these regions and displayed intact comprehension of Part 2, whereas the NM group exhibited low ISC and displayed impaired comprehension due to not having experienced Part 1. The results therefore suggest that memories of Part 1 were needed to elicit reliable activity during the beginning of Part 2 in medium- and long-timescale brain regions.
Hippocampus ISC Is Higher when Memories Are from 1 Day Ago versus Minutes Ago
For the DM group, watching the beginning of Part 2 cued retrieval of events from Part 1 from 1 day prior; in the RM condition, the same events from Part 1 were experienced just a few minutes prior. Retrieval of events from 1 day prior must rely heavily on hippocampus, but when the same events occurred just a few minutes ago, is hippocampus equally recruited? To address this question, we examined hippocampal reliability during Part 2 of the movie. Voxels throughout the hippocampus were statistically reliable in all groups when ISC was calculated across the full 10 min of Part 2 (Fig. 4a and Supplementary Fig. 3), in agreement with its known role in episodic encoding and retrieval. Differences between groups were greatest during Minutes 1–2 of Part 2 (Group × Time interaction, F12,188 = 3.04, P < 0.001), echoing the previous observation that group differences in long-timescale region activity were greatest during this window. Thus, we focused on the first 2 min of Part 2.
Hippocampal ISC was significantly higher for the DM group than all other groups during Minutes 1–2; this held regardless of whether the calculation was performed according to the number of voxels exceeding ISC threshold (Fig. 4b), mean voxel-wise ISC (Fig. 4c, left), or ROI-level ISC (Fig. 4c, right) (main effect of Group, number of voxels exceeding ISC threshold: F3,47 = 4.25, P < 0.01; voxel-level, F3,47 = 4.77, P < 0.01; ROI-level ISC, F3,47 = 3.54, P < 0.05). In contrast, ISC in long-timescale cortical areas during the 1–2 min window was equally high in RM and DM conditions, with the NM group exhibiting significantly lower ISC than the other groups both at the voxel level and ROI level (Fig. 4d) (main effect of Group, voxel level: F3,47 = 9.23, P < 0.0001; ROI level, F3,47 = 5.85, P < 0.005). Hippocampus and the long-timescale ROI demonstrated significantly different response patterns (Group × ROI interaction, F3,47 = 6.18, P < 0.005).
Hippocampus Correlates more with Long-Timescale Areas when Memories Are from 1 Day Ago versus Minutes Ago
ISFC (Simony et al. 2012; see Materials and Methods) was calculated within each group for the beginning of Part 2 (Minutes 1–2), using hippocampus as the seed region; we observed increased correlations between the hippocampus and long-timescale cortical regions for the DM condition, where day-old memories of Part 1 were needed to comprehend Part 2 of the movie, relative to the other groups (Fig. 5a). In DM, hippocampal activity was significantly correlated with voxels in the posterior cingulate, retrosplenial, and medial prefrontal cortex (the thresholds for each condition were as follows: RM1, R* = 0.25; RM2, R* = 0.26; DM, R* = 0.25; NM, R* = 0.24; see Materials and Methods for threshold statistics). On average, the hippocampal correlation with voxels in the long-timescale ROI was higher in the DM group than in the NM or RM groups (F3,47 = 5.63, P < 0.005; t-tests, P < 0.05; Fig. 5b). This pattern also held when we defined ROIs for 2 DMN member regions, posterior cingulate (PCC) and medial prefrontal cortex (mPFC), based on a resting-state atlas (Shirer et al. 2012) (Fig. 5c). In both PCC and mPFC, the correlations with the hippocampus seed were higher in DM than in the other groups (PCC: F3,47 = 3.70, P < 0.05; t-tests, P < 0.05; mPFC: F3,47 = 4.21, P < 0.05; t-tests, P < 0.05). These results demonstrate that, at the beginning of Part 2, hippocampal dynamics most resembled dynamics in long-timescale cortical regions for subjects in the DM group. No voxels were significantly correlated with hippocampus in RM or NM during Minutes 1–2 (Fig. 5a). At a lower threshold (not corrected for multiple comparisons), we did observe weak correlations between the hippocampus and long-timescale cortical areas in the RM group (Supplementary Fig. 4).
Note that, unlike standard within-subject functional connectivity, ISFC reveals only interregion correlation patterns that are locked to the processing of the external stimulus and are shared across subjects, that is, subject-specific connectivity is not detected. The relatively conservative nature of this method may explain why correlations between hippocampus and cortex did not exceed statistical thresholds in RM, despite prior findings of hippocampal within-subject functional connectivity during memory formation (Ranganath et al. 2005).
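The distinction between ISFC and within-subject functional connectivity can be illustrated with a short sketch. This is a simplified, assumed form of seed-based ISFC (each subject's seed timecourse correlated with the average target timecourses of the other subjects), not the implementation used here; the function name `isfc_seed`, the array layout, and the synthetic signals are hypothetical.

```python
import numpy as np

def isfc_seed(seed, target):
    """Seed-based intersubject functional correlation (illustrative sketch).

    seed:   (n_subjects, n_timepoints) seed-region timecourses.
    target: (n_subjects, n_timepoints, n_voxels) target timecourses.
    Returns (n_subjects, n_voxels): Pearson correlation between each
    subject's seed and the mean target timecourse of all OTHER subjects.
    """
    seed = np.asarray(seed, dtype=float)
    target = np.asarray(target, dtype=float)
    n_subj = seed.shape[0]
    total = target.sum(axis=0)
    out = np.empty((n_subj, target.shape[2]))
    for s in range(n_subj):
        others = (total - target[s]) / (n_subj - 1)      # (time, voxels)
        sz = (seed[s] - seed[s].mean()) / seed[s].std()
        oz = (others - others.mean(axis=0)) / others.std(axis=0)
        out[s] = (sz[:, None] * oz).mean(axis=0)         # Pearson r per voxel
    return out

# Synthetic demo (not real data).
rng = np.random.default_rng(1)
n_subj, n_time = 8, 400
stim = rng.standard_normal(n_time)             # stimulus-locked signal
idio = rng.standard_normal((n_subj, n_time))   # subject-specific signal
seed = stim + idio
target = np.empty((n_subj, n_time, 2))
target[:, :, 0] = stim + 0.5 * rng.standard_normal((n_subj, n_time))
target[:, :, 1] = idio                         # shares only idiosyncratic signal
r = isfc_seed(seed, target)
print(r.mean(axis=0))
```

In the demo, the first target voxel shares a stimulus-locked signal with the seed and shows a high ISFC value, whereas the second voxel is coupled to the seed only through subject-specific fluctuations: it would show strong within-subject connectivity but an ISFC value near zero.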
Neural Dynamics in Long-Timescale Cortical Regions Are Reliable but Distinct when Memories Are from 1 Day Ago versus Minutes Ago
RM and DM groups both used information from Part 1 to comprehend Part 2 but differed in how long ago they had experienced Part 1. Thus, to examine whether recent and distant memories had different influences on ongoing processing, we next asked whether cortical dynamics were similar between the 2 groups during Part 2. ISC between groups was calculated using a sliding window of 120 s to capture changes in between-group similarity over the course of the movie. To ensure unbiased comparisons, we randomly divided the RM condition into 2 groups, RM1 and RM2. The similarity between the 2 RM groups (RM1∼RM2) provides an estimated upper bound on how similar an independent group can be to the RM condition in any given time window. We compared the similarity of the DM and RM (DM∼RM2) conditions against this upper bound (RM1∼RM2).
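The logic of this comparison can be sketched as follows. The sketch assumes that between-group similarity is the correlation between group-averaged ROI timecourses within each window; the definition used in the study is given in Materials and Methods and may differ. The function name, the window length in samples, and the synthetic data (in which the simulated DM group follows a different signal for the first 100 samples and then matches RM) are illustrative assumptions.

```python
import numpy as np

def sliding_between_group_corr(group_a, group_b, window_len, step=1):
    """Correlation between two group-mean timecourses in sliding windows.

    group_a, group_b: arrays of shape (n_subjects, n_timepoints).
    Returns one correlation value per window start.
    """
    a = np.asarray(group_a, dtype=float).mean(axis=0)
    b = np.asarray(group_b, dtype=float).mean(axis=0)
    starts = range(0, a.size - window_len + 1, step)
    return np.array([np.corrcoef(a[i:i + window_len],
                                 b[i:i + window_len])[0, 1] for i in starts])

# Synthetic demo (not real data).
rng = np.random.default_rng(2)
n_time, win = 300, 60
stim = rng.standard_normal(n_time)
dm_signal = stim.copy()
dm_signal[:100] = rng.standard_normal(100)     # early divergence in "DM"

def make_group(signal, n_subj=9):
    return signal + 0.3 * rng.standard_normal((n_subj, signal.size))

rm1, rm2, dm = make_group(stim), make_group(stim), make_group(dm_signal)
upper = sliding_between_group_corr(rm1, rm2, win)   # RM1~RM2 (upper bound)
test = sliding_between_group_corr(dm, rm2, win)     # DM~RM2
print(upper[0], test[0], test[-1])
```

Here the RM1∼RM2 curve stays high throughout, while the simulated DM∼RM2 curve starts well below it and rises to the same level once the two groups' signals coincide.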
In long-timescale regions, neural timecourses differed significantly between DM and RM at the beginning of Part 2 and remained statistically different for up to 4 min (DM∼RM2 vs. RM1∼RM2, nonparametric label-shuffling test of all windows up to the window ending at 180 s, q < 0.05; 1 additional window ending at 240 s, q < 0.1; one-tailed, FDR corrected; Fig. 6). That is, in long-timescale regions, the fact that DM subjects required information from 1 day ago was continuously reflected in neural responses for up to 4 min into Part 2. Importantly, differences between RM1∼RM2 and DM∼RM2 reflect the distinct but internally consistent dynamics elicited in the RM and DM conditions; these differences do not reflect a lack of reliable within-group responses as was seen for the NM condition. Areas with intermediate and short processing timescales did not differ significantly between groups at any time window during Part 2. The results were replicated with the alternate RM group, (DM∼RM1) versus (RM1∼RM2); see Supplementary Figure 5. See Supplementary Figure 6 for different window sizes.
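For readers unfamiliar with this type of test, the sketch below shows one way a one-tailed label-shuffling test and FDR correction could be set up for a single window. It is an assumption-laden illustration, not the procedure used in this study: it assumes that group labels are shuffled between RM1 and DM subjects, that the test statistic is (RM1∼RM2) minus (DM∼RM2) computed on group-mean timecourses, and that FDR control uses the Benjamini-Hochberg procedure. The function names and synthetic data are hypothetical.

```python
import numpy as np

def label_shuffle_pvalue(rm1, dm, rm2, n_perm=1000, seed=0):
    """One-tailed permutation p-value for (RM1~RM2) - (DM~RM2) in one window.

    rm1, dm, rm2: arrays of shape (n_subjects, n_timepoints) for the window.
    Group labels of RM1 and DM subjects are shuffled (assumed scheme).
    """
    rng = np.random.default_rng(seed)
    ref = rm2.mean(axis=0)
    pooled = np.vstack([rm1, dm])
    n1 = rm1.shape[0]

    def stat(x):
        r_a = np.corrcoef(x[:n1].mean(axis=0), ref)[0, 1]
        r_b = np.corrcoef(x[n1:].mean(axis=0), ref)[0, 1]
        return r_a - r_b

    observed = stat(pooled)
    null = np.array([stat(pooled[rng.permutation(pooled.shape[0])])
                     for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted q-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    order = np.argsort(p)
    ranked = p[order] * p.size / np.arange(1, p.size + 1)
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty_like(p)
    q[order] = np.minimum(q_sorted, 1.0)
    return q

# Synthetic demo (not real data): "DM" follows a different signal than "RM".
rng = np.random.default_rng(3)
stim, other = rng.standard_normal((2, 60))
rm1 = stim + 0.3 * rng.standard_normal((9, 60))
rm2 = stim + 0.3 * rng.standard_normal((9, 60))
dm = other + 0.3 * rng.standard_normal((9, 60))
p = label_shuffle_pvalue(rm1, dm, rm2, n_perm=500)
q = bh_fdr([0.01, 0.04, 0.03, 0.5])   # example p-values from 4 windows
print(p, q)
```

In practice, one p-value would be obtained per sliding window and the full set of window-wise p-values passed to the FDR step.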
Searching across the entire brain, we found that the voxels most sensitive to the 1-day delay were those with the longest processing timescales. For every voxel in the brain that responded reliably during the movie, we calculated dissimilarity between DM and RM (two-tailed t-test of DM∼RM2 vs. RM1∼RM2, P < 0.05) during the beginning of Part 2 (2-min window) and mapped these values onto the brain (Fig. 7a). The voxels with the greatest dissimilarity values fell within the long-timescale ROI, including posterior medial cortex and angular gyrus. Next, we assigned each voxel a “timescale index,” a continuous version of the categorical (short/medium/long) timescale assignments (see Materials and Methods). Dissimilarity was positively correlated with timescale index (r = 0.32, P < 0.0001; Fig. 7b).
Early Hippocampal-Cortical Coupling in DM Subjects Is Associated with Later Cortical Reliability in Long-Timescale Areas
We observed that cortical-hippocampal coupling was greater in the DM condition than in the RM condition and that long-timescale cortical response dynamics differed between the groups for a few minutes before converging. A possible interpretation of these results is that during the early minutes of Part 2, the DM group retrieved memories of Part 1, whereas the RM group had fewer demands on hippocampal retrieval. Thus, we next asked whether hippocampal-cortical interaction (memory retrieval) in a DM subject early in Part 2 would bring them into later alignment with the RM group.
We examined individual differences in how “hippocampal-cortical interaction early in Part 2” related to “cortical timecourses later in Part 2.” A significant correlation was observed between 1) hippocampal-cortical interaction (ISFC) early in the movie and 2) the level of cortical similarity between DM subjects and the RM groups later in the movie (RM1, R = 0.63, P < 0.05; RM2, R = 0.60, P < 0.05; Supplementary Fig. 7; see Supplementary Methods). Given our prior finding that DM and RM long-timescale cortical timecourses were statistically different for up to 4 min (3 min q < 0.05, 4 min q < 0.1; Fig. 6), we used Minutes 1–4 as the “early” window, and the remainder of Part 2 (Minutes 5–10) as the “late” window. No significant correlations were observed in control analyses (see Supplementary Methods and Results). While these analyses must be treated cautiously given the relatively small number of subjects per group, they provide tentative support for the idea that hippocampal-cortical interaction in DM subjects at the beginning of Part 2 predicted their cortical similarity to the RM group as the movie unfolded.
In real-life settings, hippocampal amnesics can sustain stimulus information for several minutes (for example, during a conversation). What neural systems outside the hippocampus might support this minutes-long retention of information? Previously, we identified a hierarchy of processing timescales across the cortex, ranging from short timescales (seconds) in early sensory areas up to long timescales (minutes) in high-order regions (e.g., the DMN). To investigate how long-timescale regions process stimulus information across minutes, and what interactions they might have with the hippocampus, subjects were shown a rich continuous stimulus (a movie), either continuously (“Recent Memory”, RM), with a 1-day break between Part 1 and Part 2 (“Distant Memory”, DM), or without having viewed Part 1 (“No Memory”, NM). Analyzing fMRI responses to the beginning of Part 2, we found that cortical response patterns in long-timescale areas were reliable across subjects in both the RM and DM conditions, but not in the NM condition, demonstrating that access to information from Part 1 was critical for eliciting reliable activity during Part 2. The neural data were mirrored by behavioral tests, which showed that RM and DM subjects had equally good comprehension of Part 2, whereas NM subjects had relatively poor comprehension. Both RM and DM subjects relied on memories of the same events from Part 1 to comprehend Part 2, but for DM subjects, hippocampal activity was significantly more reliable and more strongly coupled to long-timescale cortical regions, suggesting that these subjects may have had greater retrieval demands. Furthermore, cortical dynamics differed between the RM and DM conditions for several minutes at the start of Part 2 before converging toward its end, suggesting that these long-timescale regions carried different information when memories were of events from a few minutes ago versus 1 day ago. 
The strength of early hippocampal-cortical coupling in individual DM subjects was predictive of this later convergence.
Together, these results suggest that the demands of retrieving information from 1 day earlier strengthened coupling between the hippocampus and long-timescale cortical areas and altered cortical activity patterns for the DM group. In contrast, the RM group was able to draw on the same events from a few minutes prior with less hippocampal involvement. That is, it seemed that long-timescale cortical dynamics were influenced by memories from a few minutes prior, with less reliance on the hippocampus than when memories were many hours old. Thus, we propose that long-timescale cortical regions were able to intrinsically retain some information from at least a few minutes prior.
While the idea that information can persist in ongoing cortical activity is often explored using working memory tasks, our approach differs fundamentally from such studies. First, most working memory studies assess activity levels during a delay period, separating the contents of working memory from the processing of new input. In contrast, our study examines how information from the recent (and more distant) past interacts with online stimulus processing in the present. We do not examine how prior information is protected and carried forward in time via volitional maintenance; rather, we measure how prior information spontaneously influences present neural responses during continuous stimulation, and for how long this influence of prior information persists (Hasson et al. 2015). Second, most working memory studies maximize the difficulty of maintaining information across delays by introducing distractions and using isolated, randomized memoranda, and thus the focus tends to be on attentional control rather than memory per se. In contrast, we ask how memories of past events needed now in the present differentially affect (i.e., interact with) stimulus-driven responses using a semantically rich, dynamic, and intrinsically engaging stimulus. Some researchers have advocated a departure from models of dedicated storage buffers in working memory, arguing that the same areas that perform primary processing can also keep information active during delay periods (Ericsson and Kintsch 1995; Postle 2006; Sreenivasan et al. 2014). Our study is highly compatible with these views but differs in that it explores how past information can exert a persistent influence across time, even if the representation of the prior information is not “maintained,” in the traditional sense, via sustained delay period activity. 
We further demonstrate that the minutes-old and day-old memories affect response dynamics in cortical areas with intermediate and long processing timescales, but not in low-level sensory areas, which have short (hundreds of milliseconds) processing timescales (Hasson et al. 2008; Lerner et al. 2011; Honey, Thesen, et al. 2012). Finally, this work explores how neural responses underlying mnemonic processes behave in the context of real-life situations.
How might the brain retain information for many minutes in areas with long processing timescales? Recent fMRI and ECoG studies have shown that the intrinsic timescales of neural dynamics vary along a spatial gradient, with faster dynamics in areas with short processing timescales (e.g., early auditory cortex) and slower dynamics in higher-order brain regions with long processing timescales (e.g., the default network) (Honey, Thesen, et al. 2012; Baria et al. 2013; Stephens et al. 2013). This gradient of neural dynamics was observed both at rest and during the processing of real-life stories, suggesting that the intrinsic neural dynamics of a given neural circuit (i.e., how fast or slow the signal fluctuations at rest are) may be related to its processing timescale capabilities (i.e., the ability to accumulate information over short or long timescales). However, the neural mechanisms that underlie the capacity to retain information in a neural circuit for many minutes are unclear. Slow fluctuations in high-order cortical areas could be supported by recurrent circuit activity (Durstewitz et al. 2000; Brody et al. 2003) or by long-timescale effects local to synapses and membranes (Marom 1998; Mongillo et al. 2008; Barak and Tsodyks 2014). Within a given cortical region, there may be neural subpopulations with differing time constants expressed in a task-dependent manner (Bernacchia et al. 2011), and interregional interactions could also constrain the timescales of brain regions (Nir et al. 2008; Leopold and Maier 2012).
What is the nature of the information represented in long-timescale brain regions? Interestingly, the regions identified as having long processing timescales overlap strongly with the DMN, including posterior medial cortex, medial prefrontal cortex, middle temporal gyrus, and angular gyrus. In this study, we hoped to explore how memory retrieval would affect activity in long-timescale areas under naturalistic conditions, as there is a body of work exploring the role of the DMN in memory retrieval paradigms (Kim 2010; Rugg and Vilberg 2013). We believe that the long-timescale properties of DMN cortical regions may arise from both 1) interaction with structures like the hippocampus that support episodic memory (Kahn et al. 2008; Buckner 2010) and 2) intrinsically slow dynamics that enable the persistence of contextual information over time (Honey, Thesen, et al. 2012; Stephens et al. 2013). How these aspects of DMN activity relate to its role in cognition is an open question, in part because the role of the DMN in cognition is multifaceted and debated in the field. These areas have been implicated in many aspects of high-level cognition, including episodic memory recollection (Rugg and Vilberg 2013), prediction error-based event segmentation (Kurby and Zacks 2008; Swallow et al. 2010), self-relevant decision making, prospective thinking (Andrews-Hanna et al. 2010), and schema knowledge (Maguire et al. 1999; Mar 2004; van Kesteren et al. 2010).
An emerging view is that these different aspects of DMN function are unified by a common theme: They require the construction and application of “situation models” (Kintsch 1988; Zwaan and Radvansky 1998), which are mental frameworks of “the relationships between entities, actions and outcomes, the gist of the spatial, temporal and causal relationships that apply within a particular context” (Hassabis and Maguire 2007; Ranganath and Ritchey 2012). In other words, these brain regions carry information about high-level structure in the world: information about places and situations that provides a schematic context within which events occur. Applying this description to the current study, subjects gather sensory information from the movie to build situation models of the story entities and events. The long-timescale capability of DMN brain regions facilitates the accumulation of information for building the model and supports its retention over minutes as the model is used to interpret new input. When faced with a previously encountered situation, prior situation models may be restored from long-term memory, via interaction with the hippocampus, to enable immediate comprehension of the current scene (e.g., when day-old information is revived in the DM condition).
What is the role of the hippocampus during processing of continuous real-life stimuli? The hippocampus is well established as supporting episodic encoding and retrieval (Squire and Wixted 2011), and studies using naturalistic stimuli have shown that retrieval-related hippocampal activity can be modulated by event changes just a few seconds prior (Swallow et al. 2010). It is not under dispute that the hippocampus was involved in online episodic memory encoding and retrieval while subjects watched the movie. Indeed, hippocampus responded reliably during the movie in all conditions. However, the observation of increased response reliability in the hippocampus during the DM condition (Fig. 4b,c) suggested greater demands on hippocampal processing in the DM group than in the other groups. The mere difference in delay time does not explain why RM and DM neural responses differed in the hippocampus, as direct comparisons of episodic retrieval (using arbitrary stimuli) from minutes ago versus days ago find no effect of study-test interval on BOLD activity in the hippocampus or medial temporal lobe cortex (Stark and Squire 2000). Furthermore, we found significantly enhanced correlations between the hippocampus and long-timescale cortical areas in DM at the start of Part 2 (Fig. 5) and found that these hippocampal-cortical interactions early in Part 2 predicted later DMN similarity to the RM condition (Supplementary Fig. 7). Our observations suggest that as Part 2 of the movie unfolded, DM subjects retrieved more and more information from episodic memory via the hippocampus, thus reinstating prior situation models and bringing the subjects in the DM condition into closer neural alignment with the subjects in the RM group, who did not take a break from the movie.
In the current design, we chose a 1-day break duration in the DM condition in order to be confident that subjects had to rely on the hippocampus to retrieve information from Part 1. Based on prior studies of hippocampal amnesia, a break of a few minutes that includes some distraction or interference severely impairs an amnesic patient's ability to access information from an episode before that break, that is, access to the pre-break episode is hippocampus dependent. Following this logic, we expect that in our paradigm the results would be very similar for a break of a few minutes (with interference) as for a 1-day break. However, with a shorter break between Parts 1 and 2, in neurotypical subjects, there could also be gradation in this effect such that the degree of dependence on the hippocampus varies according to both 1) the duration of the delay and 2) the degree of change between the movie context and the break context. Simple context changes can certainly impact memory even over short delays; for example, Radvansky and Copeland (2006) showed that in a virtual reality environment, passing from one room to another differentially impaired memory for items left behind in the prior room just moments earlier. Thus, even with a very short break, information from Part 1 could become hippocampally dependent. Furthermore, the more the “break context” differed from the “movie context,” the more the continuity of information would be disrupted in long-timescale cortical regions, and the more hippocampus dependent access to Part 1 would become. Our current design uses a break of a full day to maximize the hippocampal dependence of access to Part 1; future studies could explore this variable by manipulating the duration and/or the degree of the context shift between Parts 1 and 2 of the movie. 
Future studies could also improve signal-to-noise by using larger group sizes, which could additionally strengthen inferences drawn from across-subject correlations such as the early hippocampal-cortical coupling versus later cortical reliability analysis (Supplementary Fig. 7).
In this study, we explored how long-timescale regions process stimulus information across minutes, and what interactions they have with the hippocampus when memories are from the recent past (a few minutes ago) or more distant past (a day ago). We showed subjects a realistic audiovisual movie stimulus; one group had relevant experience from several minutes prior, the other group from 1 day prior. Retrieval of events from 1 day prior must rely heavily on the hippocampus, and indeed during the DM condition, we found that hippocampal activity was correlated with activity in long-timescale cortical regions (including cortical areas within the DMN). This observation is consistent with evidence that the hippocampus and the DMN work together to support real-world memory. However, hippocampal-cortical correlations were substantially weaker when memories of the exact same events were from a few minutes prior instead of from 1 day prior, suggesting that the 1-day-break group had greater retrieval demands. Furthermore, DMN activity patterns differed between the 2 groups for several minutes before converging, suggesting that different information may have been carried in these long-timescale cortical areas. We propose that, in the minutes-prior condition, cortical regions with long processing timescales intrinsically retained information from the previous several minutes. Together, the data suggest that, under conditions in which coherent sequences of information arrive from the world without interruption, distributed and hierarchical cortical circuits (Fuster 1997, 2000) can intrinsically retain some of the information over minutes of time.
This work was supported by the National Institutes of Health (R01-MH094480 and 2T32MH065214-11).
We thank Anna Schapiro and Yuan Chang Leong for their helpful comments on earlier versions of the manuscript. Conflict of Interest: None declared.