Upon repetition, certain stimuli induce reduced neural responses (i.e., repetition suppression), whereas others evoke stronger signals (i.e., repetition enhancement). It has been hypothesized that stimulus properties (e.g., visibility) determine the direction of the repetition effect. Here, we show that the very same stimuli can induce both repetition suppression and enhancement, whereby the only determining factor is the number of repetitions. Repeating the same, initially novel low-visible pictures of scenes up to 5 times enhanced the blood oxygen level–dependent (BOLD) response in scene-selective areas, that is, the parahippocampal place area (PPA) and the transverse occipital sulcus (TOS), presumably reflecting the strengthening of the internal representation. Additional repetitions (6–9) resulted in progressively attenuated neural responses indicating a more efficient representation of the now familiar stimulus. Behaviorally, repetition led to increasingly faster responses and higher visibility ratings. Novel scenes induced the largest BOLD response in the PPA and also higher activity in yet another scene-selective region, the retrosplenial cortex (RSC). We propose that 2 separable processes modulate activity in the PPA: one process optimizes the internal stimulus representation and involves TOS and the other differentiates between familiar and novel scenes and involves RSC.
It has long been assumed that presenting a stimulus repeatedly leads to reduced neural activity (i.e., repetition suppression) as neural processing becomes more efficient, for example, through a “sharpening” of the neural response (for a review, see Grill-Spector et al. 2006). This repetition suppression is proposed to be the physiological basis both for perceptual priming, that is, faster responses to repeated stimuli despite the lack of conscious recognition (James et al. 2000), and for novelty detection, that is, automatic discrimination between novel and familiar stimuli (Ranganath and Rainer 2003). However, a growing set of data demonstrates that under certain circumstances, presenting a stimulus frequently results in stronger neural responses (i.e., repetition enhancement) than presenting it only once (Dolan et al. 1997; Grill-Spector et al. 2000; Henson et al. 2000; Kourtzi et al. 2005; James and Gauthier 2006; Turk-Browne et al. 2007). For example, Henson et al. (2000) showed that with every repetition of a formerly unfamiliar face, the blood oxygen level–dependent (BOLD) response in the fusiform face area increased, even when the face was repeated up to 5 times. Familiar faces of famous persons, on the other hand, were associated with the usual attenuation upon repetition. The authors hypothesized that neural responses are enhanced when repeated exposure of a stimulus entails additional internal operations such as establishing a new representation of a formerly unknown face. This idea implies that after familiarization with a formerly unfamiliar face (i.e., after even more repetitions of that face), the activity changes should at some point reverse from enhancement to suppression, that is, to the pattern observed with the famous faces, which were familiar from the beginning.
In another functional magnetic resonance imaging (fMRI) study, Turk-Browne et al. (2007) observed that when the same low-visible scene was shown twice, BOLD responses in the parahippocampal place area (PPA) were more pronounced than when the second scene was novel. However, the opposite pattern was observed for highly visible images: PPA activity was now lower when the second stimulus was a repetition of the first than when the second scene was novel. One explanation for their finding is that the second presentation of a hardly visible scene was used to further strengthen the initially weak representation of this scene (Rainer et al. 2004; Turk-Browne et al. 2008). Such strengthening was unnecessary for the highly visible scene for which a stable representation was already established after the first exposure.
The above mentioned studies found repetition enhancement only for a certain type of stimulus (low visibility, unfamiliar, degraded, or masked), whereas repetition suppression was associated with another type of stimulus (high visibility, familiar, undegraded, or unmasked). In other words, the sign of the repetition effect, according to these studies, may solely depend on the stimulus type, whereby upon repetition, one type induces enhancement and the other suppression. The aim of the present study was to test whether the same stimuli can evoke both enhanced and suppressed responses solely based on the number of repetitions, that is, stimulus quantity rather than quality. A similar idea has been put forward by Turk-Browne et al. (2008) who proposed that in an initial phase, repeated stimuli may receive enhanced processing up to a point in which they have been fully represented. After that, reduced responses may redirect processing efforts to novel stimuli. In other words, many repetitions of low-visible, initially unfamiliar scenes should lead to an inverted U-type pattern indexing internal representation shaping encompassing 2 steps: first strengthening, then sharpening.
However, novelty detection could also constitute a separate process distinguishable from representation shaping. Novelty detection is thought to be a prerequisite for the orienting response to unexpected, potentially dangerous events and has been proposed to involve brain structures like the frontal cortex and limbic structures (Ranganath and Rainer 2003; Yamaguchi 2004; Gur et al. 2007). If repetition suppression to familiar scenes in PPA was the basis for detection of novel scenes, then the latter should induce a stronger response irrespective of the representation quality of repeated scenes especially when novel events occur unexpectedly (Summerfield et al. 2008).
Here, we tested both novelty detection and representation shaping in a paradigm in which most of the low-visible scenes were repeated 9 times. Interspersed among the repeated scenes were occasional (unexpected) novel scenes that were only shown once. Two competing effects could be expected: The shaping hypothesis predicts that the neural response in the PPA will initially grow as low-visibility stimuli are repeated and eventually decrease, following an inverted U-shaped profile (for a similar proposal, see Turk-Browne et al. 2008). The novelty detection hypothesis predicts that the PPA response will always be stronger for rare novel versus repeated stimuli, irrespective of the number of repetitions or the quality of the stimuli.
In order to test these hypotheses, we assessed the BOLD response as a function of the number of presentations in bilateral PPA, individually mapped with an independent localizer scan. We also investigated 2 other scene-specific areas, namely, the retrosplenial cortex (RSC) and the transverse occipital sulcus (TOS), as recent studies have shown that their activity is also modulated by the familiarity/novelty of a scene (Epstein et al. 2007). These analyses tested whether scene-responsive areas are modulated by the progress of representation shaping (i.e., inverted U-shape function) or rather by novelty (i.e., increased responses to novel versus all repeated stimuli).
Materials and Methods
Twenty healthy subjects (14 females, 3 left handed, mean age: 22.4 years, range: 18–38 years) volunteered to participate in the study in exchange for monetary compensation. All subjects reported normal or corrected-to-normal vision. Informed consent was obtained from all subjects, and the study was conducted in conformity with the Declaration of Helsinki and approved by the local ethics committee.
Stimuli and Procedure
All stimuli were presented using a magnetic resonance (MR)-compatible goggle system (resolution 800 × 600) with 2 organic light-emitting diode displays (MR Vision 2000; Resonance Technology, Northridge, CA), at a refresh rate of 60 Hz and located at a virtual distance of 1.2 m from the subjects. The visible screen size subtended 30° × 22.5° in the horizontal and vertical plane, respectively. Presentation software (version 10.3) was used for stimulus presentation and response collection.
Each subject underwent 3 tests inside the scanner: a threshold experiment, the main fMRI experiment, and a recognition/visibility experiment. The threshold experiment aimed at identifying the individual contrast threshold that yielded the desired performance (∼70% correct outdoor/indoor discrimination of novel scenes). In the threshold experiment, 50 masked grayscale photographs of indoor and outdoor scenes were briefly presented (50 ms) at 10 different contrast levels ranging from 89% to 98%. In total, 500 images were presented in pseudorandomized order (at least 2 images separated scenes with the same content at different contrast levels). Subsequently, the stimulus contrast that yielded 70% accuracy was used in the main fMRI study. To avoid any effect of familiarization, new masked grayscale photographs were presented in the main fMRI experiment. Here, a total of 6 experimental runs were performed. Each run contained its own set of images and comprised 10 blocks. Within the first block, 11 novel indoor/outdoor scenes were shown. From these, 10 were repeated in the following 9 blocks (5 indoor and 5 outdoor). The 11th image changed from block to block, that is, was always novel and could correspond to an indoor or outdoor scene (balanced throughout the blocks). Presentation was continuous, that is, the existence of blocks was not revealed to the subjects. Within a block (11 images), image presentation was pseudorandomized such that the position of an individual scene never changed more than 2 positions with regard to the position it had in the previous block. This restriction was introduced to avoid that within a block, one image was repeated, say, for the 5th and the other for the 6th time. Figure 1 depicts the trial configuration. 
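The block structure of one run can be illustrated with a minimal Python sketch. It is not the authors' randomization code, which is not described in detail. The slot labels (`R0` to `R9`, `NOVEL`), the function names, and the two-pass adjacent-swap scheme are assumptions chosen for illustration. The scheme guarantees that no slot moves more than 2 positions between consecutive blocks, but it samples only a subset of all orders that satisfy this constraint.

```python
import random

def bounded_shuffle(order, rng):
    """Reorder a block so that no item moves more than 2 positions.

    Two passes of disjoint adjacent swaps: each item moves at most one
    position per pass, hence at most two positions in total.
    """
    order = list(order)
    for start in (0, 1):
        for i in range(start, len(order) - 1, 2):
            if rng.random() < 0.5:
                order[i], order[i + 1] = order[i + 1], order[i]
    return order

def make_run(rng, n_blocks=10):
    """Return the slot order for each of the 10 blocks of one run.

    R0..R9 stand for the 10 repeated scenes; NOVEL marks the slot that is
    filled with a new, never-repeated scene in every block (hypothetical labels).
    """
    slots = [f"R{i}" for i in range(10)] + ["NOVEL"]
    rng.shuffle(slots)  # block 1: all 11 scenes are still novel
    blocks = [slots]
    for _ in range(n_blocks - 1):
        blocks.append(bounded_shuffle(blocks[-1], rng))
    return blocks

rng = random.Random(0)
run = make_run(rng)
# Largest change in position of any slot between consecutive blocks
max_shift = max(
    abs(prev.index(slot) - cur.index(slot))
    for prev, cur in zip(run, run[1:])
    for slot in prev
)
```

Because every slot stays within 2 positions of its previous location, all repeated scenes within a block share the same repetition count, which is the property the restriction was meant to secure.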
Each trial lasted 2 s and consisted of a red fixation cross (200 ms), followed by a brief presentation of a single scene (50 ms), which was then replaced by a blank screen containing only a fixation cross (50 ms) followed by a checkerboard mask (50 ms). In the following 1650 ms, subjects were asked to indicate, as fast and accurately as possible, by button presses whether the stimuli corresponded to an indoor or outdoor scene. A display reminded them of the response alternatives; after the subject's response, it was replaced by a fixation cross, which remained visible until the trial terminated. Trials were spaced by baseline periods of 0, 2, or 4 s, whereby the jittering was designed to optimize statistical efficiency, that is, the accuracy with which the event-related hemodynamic response to different stimuli could be deconvolved. A simulation procedure was performed that uses a nonlinear steepest descent approach which iteratively optimizes the estimated hemodynamic response function (HRF) and the set of weighting factors with the goal of minimizing the residual error (Dale 1999; Hinrichs et al. 2000).
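As a quick consistency check, the trial timeline described above can be written out in a few lines of Python. The event labels and variable names are illustrative only; the durations are those given in the text.

```python
# Trial events and their durations in milliseconds, as described in the text
events_ms = [
    ("fixation cross", 200),
    ("scene", 50),
    ("blank with fixation cross", 50),
    ("checkerboard mask", 50),
    ("response window", 1650),
]

timeline = {}
onset = 0
for name, duration in events_ms:
    timeline[name] = (onset, onset + duration)  # (start, end) within the trial
    onset += duration

trial_ms = onset
# Scene-to-mask stimulus onset asynchrony
soa_ms = timeline["checkerboard mask"][0] - timeline["scene"][0]
# Onset-to-onset intervals between trials for baseline periods of 0, 2, or 4 s
trial_onset_intervals_s = [trial_ms / 1000 + jitter for jitter in (0, 2, 4)]
```

The durations add up to the stated 2-s trial length, the mask follows scene onset by 100 ms, and successive trials start 2, 4, or 6 s apart, that is, at multiples of the 2-s repetition time.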
To investigate whether repeating a visual scene indeed improves the internal representation of that scene, after completing the main fMRI experiment, we conducted a behavioral control test showing old and new scenes at different contrast levels. Specifically, we presented 36 scenes, which had been used during MR scanning, and 36 novel scenes, never seen before. The 36 old images were randomly chosen for each subject with the constraint that the images came from each of the 6 runs (3 indoor/3 outdoor scenes per run). Each scene was presented in 5 contrast levels: the individually determined contrast level used during scanning (reference level) and 2 contrast levels below and above the reference level. Thus, if during scanning, for example, the contrast level was set to 91%; in the posttest contrast levels ranged between 89% and 93%. In total, 180 images of old and 180 images of new scenes were shown. In each trial, subjects performed first a discrimination task (outdoor/indoor judgment) and then rated how clear their perception of the images was on a 4-point scale: 1 (completely invisible), 2 (brief glimpse, but could not recognize what it was), 3 (almost clear perception), 4 (clear perception). This scale aimed at investigating the visibility of the stimuli and not confidence. Hence, not only discrimination performance but also the subjective quality of the stimuli was assessed (for a similar procedure, see Melloni et al. 2011). We predicted that if repetition of the low-visible images increases the stability of their neural representations, then higher visibility ratings can be expected at yet lower contrast levels because a robust neural representation helps to stabilize percepts against shifts of low-level stimulus parameters such as contrast levels (Kleinschmidt et al. 2002).
Each subject also participated in a localizer experiment, in which we presented alternating blocks of novel scenes with blocks of faces, objects, scrambled objects, and bodies. Each block lasted 10 s and was repeated 4 times. The contrast between places and faces was used as this yielded the most robust results.
Functional and anatomical magnetic resonance imaging data were acquired with a 3-T Siemens (Erlangen, Germany) Magnetom Allegra scanner at the Brain Imaging Center in Frankfurt/Main, Germany. In all functional scans, a T2*-weighted gradient-recalled echo-planar imaging sequence was used (34 slices; repetition time [TR], 2000 ms; echo time [TE], 30 ms; field of view, 192 mm; in-plane resolution, 3 × 3 mm; slice thickness, 3 mm; gap thickness, 0.3 mm). For detailed anatomical imaging, in each subject, a T1-weighted magnetization prepared rapid acquisition gradient echo (MP-RAGE) sequence was collected (TR, 2300 ms; TE, 3.49 ms; flip angle, 12°; matrix, 256 × 256; voxel size, 1.0 × 1.0 × 1.0 mm). Neuroimaging data were analyzed using the BrainVoyager QX (Brain Innovation, Maastricht, the Netherlands) software package. The first 4 volumes of each experimental run were discarded to preclude T1 saturation effects. Preprocessing of the functional data included the following steps: 1) 3-dimensional motion correction, 2) linear trend removal and temporal high-pass filtering at 0.0054 Hz, 3) slice-scan-time correction with sinc interpolation, and 4) spatial smoothing using a Gaussian filter of 4 mm (full width at half maximum). Functional and anatomical data were coregistered, brought into Anterior Commissure-Posterior Commissure space using cubic spline interpolation, and then transformed into standard Talairach space using trilinear interpolation.
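For readers who want to relate the preprocessing parameters to more familiar quantities, the short Python sketch below converts the smoothing kernel's full width at half maximum into a Gaussian standard deviation and the high-pass cutoff into a period. The variable names are illustrative; the conversion formulas are the standard ones and are not specific to BrainVoyager.

```python
import math

FWHM_MM = 4.0          # spatial smoothing kernel (full width at half maximum)
HIGH_PASS_HZ = 0.0054  # temporal high-pass cutoff
TR_S = 2.0             # repetition time of the functional scans

# Standard relation for a Gaussian: FWHM = 2 * sqrt(2 * ln 2) * sigma
sigma_mm = FWHM_MM / (2 * math.sqrt(2 * math.log(2)))

# Signal fluctuations slower than this period are removed by the filter
cutoff_period_s = 1 / HIGH_PASS_HZ
cutoff_period_vols = cutoff_period_s / TR_S
```

The 4-mm kernel corresponds to a standard deviation of about 1.7 mm, and the 0.0054-Hz cutoff to a period of about 185 s, or roughly 93 volumes at a TR of 2 s.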
Region of Interest Analysis
Individual PPA regions of interest (ROIs) were functionally localized bilaterally based on the independent localizer scan. Blocks of faces and scenes were separately modeled with canonical HRFs used as regressors in a multiple regression analysis. A linear contrast of the scene blocks versus the face blocks created a statistical parametric map of t values with a strict threshold (P < 0.001, corrected for familywise error rate, cluster threshold of 5 voxels). The maximally scene-selective voxel in the PPA was used as the center of a cluster comprising the ∼200 surrounding most active voxels (interpolated voxel size 1 × 1 × 1 mm³) in each hemisphere (Epstein et al. 2003; Yi and Chun 2005; Turk-Browne et al. 2007). A typical subject's localizer result and ROIs are presented in Figure 4.
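One possible reading of this selection rule is a greedy region-growing procedure that starts at the peak voxel and repeatedly adds the most active neighboring voxel until about 200 voxels are collected. The Python sketch below illustrates that reading on a synthetic t map. The function name `grow_roi`, the face-neighbor connectivity, and the toy map are assumptions; the exact procedure used in BrainVoyager is not specified in the text.

```python
import heapq

def grow_roi(tmap, peak, n_voxels=200):
    """Grow a contiguous cluster from `peak`, always adding the frontier
    voxel with the highest t value (face neighbors only)."""
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    roi = set()
    seen = {peak}
    frontier = [(-tmap[peak], peak)]  # max-heap via negated t values
    while frontier and len(roi) < n_voxels:
        _, voxel = heapq.heappop(frontier)
        roi.add(voxel)
        x, y, z = voxel
        for dx, dy, dz in neighbours:
            nb = (x + dx, y + dy, z + dz)
            if nb in tmap and nb not in seen:
                seen.add(nb)
                heapq.heappush(frontier, (-tmap[nb], nb))
    return roi

# Synthetic t map (not study data): values fall off with distance from the center
grid = range(15)
tmap = {
    (x, y, z): 10.0 - ((x - 7) ** 2 + (y - 7) ** 2 + (z - 7) ** 2) ** 0.5
    for x in grid for y in grid for z in grid
}
peak = max(tmap, key=tmap.get)  # maximally "scene-selective" voxel of the toy map
roi = grow_roi(tmap, peak)
```

An alternative reading would rank all voxels within a fixed radius of the peak and keep the top 200, which can yield non-contiguous clusters; the greedy variant keeps the ROI connected.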
The transverse occipital sulcus (TOS) and RSC were defined based on a group analysis of the localizer scan data with the contrast places versus faces. The group approach was chosen because TOS and RSC could not reliably be identified in individual brains. Otherwise, the cluster selection criteria were identical to those for the PPA, and the clusters served as ROIs for event-related analyses of the main experiment.
For each subject and each ROI, a deconvolution analysis (normalized to percentage signal change and corrected for serial correlation) on the basis of a general linear model was performed in order to estimate the HRF for each trial. Ten stick predictors (one predictor per volume) were defined to cover the temporal extent of a typical hemodynamic response (20 s). Beta values of time courses of activation were extracted for each experimental condition in each subject in each ROI. The observed peaks (bins 3 to 4) from each subject's HRF for PPA, TOS, and RSC were averaged per condition and analyzed with analyses of variance (ANOVAs).
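The logic of a deconvolution with stick predictors can be sketched as a finite impulse response model: each of the 10 post-onset volumes gets its own predictor, and the fitted betas trace out the response shape. The Python sketch below uses a single condition, noise-free simulated data, and ordinary least squares, so it omits the percentage signal change normalization and the serial correlation correction mentioned above. The response shape `true_hrf`, the jitter scheme, and the reading of "bins 3 to 4" as 1-based bin numbers are assumptions.

```python
import random

N_BINS = 10  # ten stick predictors, one per 2-s volume (20-s window)

def fir_design(onsets, n_vols, n_bins=N_BINS):
    """Design matrix with one stick predictor per post-onset volume."""
    X = [[0.0] * n_bins for _ in range(n_vols)]
    for onset in onsets:
        for b in range(n_bins):
            if onset + b < n_vols:
                X[onset + b][b] += 1.0
    return X

def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [row[k] for row in A]

# Simulated single-condition time course (not study data)
rng = random.Random(0)
true_hrf = [0.0, 0.1, 0.3, 0.4, 0.3, 0.15, 0.05, 0.0, -0.03, -0.02]
onsets, t = [], 0
for _ in range(60):
    onsets.append(t)
    t += 1 + rng.choice([0, 1, 2])  # 2-s trial plus 0, 2, or 4 s baseline, in TRs
n_vols = onsets[-1] + N_BINS
X = fir_design(onsets, n_vols)
y = [sum(x * h for x, h in zip(row, true_hrf)) for row in X]  # overlapping responses

betas = least_squares(X, y)
# Peak estimate: mean of bins 3 and 4 (1-based), i.e. list indices 2 and 3
peak = (betas[2] + betas[3]) / 2
```

Because the trial onsets are jittered, the overlapping responses can be separated, and the betas recover the simulated response shape. With real data, the same design would contain one set of 10 predictors per condition.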
Both behavioral and fMRI-ROI data were submitted to a repeated-measure ANOVA with the factor repetition. To avoid power problems inherent with a 10-level factor design, we grouped the factor repetition into 4 levels: new, early, middle, and late. New refers to the novel (interspersed) scenes, early corresponds to the mean of the first 3 repetitions (1–3), middle to the mean of repetitions (4–6), and late to the mean of repetitions (7–9). Results were Greenhouse–Geisser corrected where appropriate.
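The grouping of the repetition factor can be expressed as a small mapping, sketched here in Python. The function names and the example values are illustrative and are not study data.

```python
from statistics import mean

def repetition_level(presentation):
    """Map 'novel' or a repetition number (1-9) onto the 4-level factor."""
    if presentation == "novel":
        return "new"
    if 1 <= presentation <= 3:
        return "early"
    if 4 <= presentation <= 6:
        return "middle"
    if 7 <= presentation <= 9:
        return "late"
    raise ValueError(f"unexpected presentation: {presentation!r}")

def collapse(responses):
    """Average per-presentation values within each of the 4 levels."""
    grouped = {}
    for presentation, value in responses.items():
        grouped.setdefault(repetition_level(presentation), []).append(value)
    return {level: mean(values) for level, values in grouped.items()}

# Made-up values for one subject, for illustration only
example = {"novel": 0.30, **{rep: 0.10 + 0.01 * rep for rep in range(1, 10)}}
levels = collapse(example)
```

Each subject thus contributes one value per level to the repeated-measures ANOVA, which reduces the factor from 10 to 4 levels.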
Our design involved a possible confound between repetition and time, that is, the number of repetitions was always low at the beginning and high at the end of an experimental run, so that modulations attributed to repetition might indeed have stemmed from other time-dependent but unspecific factors like variations in arousal or scanner noise. In order to rule out this possible confound, we ran 2 additional analyses. First, we compared the activation for the novel images presented in the first block with that for novel images presented later in the runs. If the levels of activity were similar, this would argue against an unspecific time-dependent confound because such a confounding process would have affected the novel events at the beginning differently from those presented later. Second, we ran an ANOVA in which we looked for an interaction time (early, middle, late) × stimulus type (novel vs. repeated). A significant interaction would indicate that only one stimulus type was modulated by time again arguing against unspecific time-dependent processes, which should modulate responses to both types of stimuli in the same way. Because the ratio of novel versus repeated scenes was rather unbalanced in the repetition blocks (1 novel:10 repeated) for this analysis, we downsampled the number of repeated scenes by randomly selecting one repeated scene per block. Hence, the number of repeated trials that entered the analysis matched that of novel trials.
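The downsampling step for the second control analysis can be sketched as follows in Python. The trial representation as `(type, label)` tuples and the function name are assumptions made for illustration.

```python
import random

def balance_block(block_trials, rng):
    """Keep the novel trial(s) of a block plus one randomly chosen repeated trial."""
    novel = [t for t in block_trials if t[0] == "novel"]
    repeated = [t for t in block_trials if t[0] == "repeated"]
    return novel + [rng.choice(repeated)]

rng = random.Random(0)
# Repetition blocks 2 to 10: 10 repeated scenes and 1 novel scene each
blocks = [
    [("repeated", f"scene{i}") for i in range(10)] + [("novel", f"novel_block{b}")]
    for b in range(2, 11)
]
balanced = [trial for block in blocks for trial in balance_block(block, rng)]
n_novel = sum(1 for kind, _ in balanced if kind == "novel")
n_repeated = sum(1 for kind, _ in balanced if kind == "repeated")
```

After this step, each repetition block contributes one novel and one repeated trial, so both stimulus types enter the time × stimulus type ANOVA with equal trial counts.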
As can be seen from Figure 2, during fMRI, reaction times (RTs) were slowest for novel scenes (673 ms) and became increasingly faster as a function of the number of repetitions: 665 ms for early repetitions, 655 ms for middle, and 646 ms for late repetitions (F2.86,54.39 = 3.54, P = 0.022, ϵ = 0.95). Accuracy was higher for repeated than for novel scenes (69% vs. 65%, t = 3.23, P = 0.004) but did not improve with the number of repetitions (P > 0.1). RTs to the novel first block images were not different from those to novel images presented later in the run (P = 0.7).
The postscan control experiment showed that both contrast level (F1.6,30.85 = 67.20, P < 0.001, ϵ = 0.65) and old/new (F1,19 = 10.56, P < 0.004) had a significant effect on accuracy, with more correct responses for high contrast and old images. In addition, old scenes were rated as more visible than new ones (1.84 vs. 1.74, F1,19 = 10.56, P = 0.004), and as expected, visibility ratings decreased as a function of contrast level (F1.62,30.84 = 67.2, P < 0.001, ϵ = 0.41) (Fig. 3). Importantly, in line with the prediction that previous exposure will improve the quality of the internal representation, we observed an interaction contrast level × old/new (F1,19 = 9.46, P = 0.04), indicating an advantage of old over new scenes especially at lower contrast levels. At low contrast, image quality is more degraded and thus previous exposure is particularly relevant to boost visibility levels. Hence, repetition of low-visible scenes led to more robust neural representations. These helped to stabilize perception in the face of changes in bottom-up stimulus attributes, such as contrast levels.
fMRI-Parahippocampal Place Area
In all subjects, PPA could be easily identified bilaterally from the localizer scans. Mean Talairach coordinates were 23 (range: 19–25), −41 (range: −37 to −45), −8 (range: −6 to −10) for the right and −25 (range: −22 to −28), −42 (range: −38 to −46), −8 (range: −5 to −11) for the left PPA (for an example, see Fig. 4). As can be seen in Figure 5, while BOLD responses in PPA increased from repetition 1 to 5, further repetitions led to more and more attenuated responses (4-level factor repetition ANOVA, F1.99,37.94 = 4.60, P = 0.016, ϵ = 0.66). Planned contrasts confirmed this observation as middle repetitions had stronger BOLD responses than early (P = 0.04) and late repetitions (P = 0.006). The BOLD response to novel images was higher than to any repeated image, which was confirmed by planned contrasts comparing new and early (P = 0.015), new and middle (P = 0.043), and new and late (P = 0.018) repetitions, respectively (Fig. 6). Note that all reported results could be replicated when instead of novel images from later blocks, the novel images from the first block were entered into the analysis.
An additional 3-level ANOVA that only included the repeated blocks (early, middle, late) was significant with F1.81,34.34 = 4.11, P = 0.028, ϵ = 0.90. The tests of within-subjects contrasts revealed a quadratic pattern (F1,19 = 10.80, P = 0.004) indicating that repeated scenes in the middle of an experimental run yielded larger responses than those at the beginning or at the end of the run.
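The quadratic within-subjects contrast can be understood as a weighted sum per subject that is then tested against zero. The Python sketch below uses synthetic values (not study data) and the contrast weights (−1, 2, −1) for early, middle, and late; with a single contrast, the F value with 1 and n − 1 degrees of freedom equals the squared one-sample t value. All variable names are illustrative.

```python
import math
import random
import statistics

WEIGHTS = (-1, 2, -1)  # quadratic contrast over early, middle, late

def contrast_score(early, middle, late):
    """Positive when the middle phase exceeds the average of early and late."""
    return sum(w * v for w, v in zip(WEIGHTS, (early, middle, late)))

# Synthetic per-subject peak responses (early, middle, late), for illustration only
rng = random.Random(1)
subjects = [
    (0.20 + rng.gauss(0, 0.05), 0.30 + rng.gauss(0, 0.05), 0.20 + rng.gauss(0, 0.05))
    for _ in range(20)
]

scores = [contrast_score(*s) for s in subjects]
n = len(scores)
t_value = statistics.mean(scores) / (statistics.stdev(scores) / math.sqrt(n))
f_value = t_value ** 2  # F(1, n - 1) for a single-degree-of-freedom contrast
df_error = n - 1

# A purely linear profile yields a contrast score of zero
linear_score = contrast_score(0.1, 0.2, 0.3)
```

A significant positive contrast indicates an inverted U-shaped profile across the three repetition phases, whereas a purely linear increase or decrease would leave the contrast near zero.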
Relation between PPA Activity and Visibility Rating
In order to more directly evaluate whether the strength of the PPA BOLD response to a presented scene relates to the quality of the internal representation of that scene, we ran a comparison between PPA BOLD response and the visibility rating in the postscanning experiment. To do so, a median split was performed between old scenes that received high (3, 4) versus low (1, 2) visibility ratings. We then tested whether the BOLD response in the PPA during the middle phase of the scanning experiment could predict later visibility, that is, differed between scenes that were later reported to be well visible versus poorly visible. The respective t-test revealed that the well visible scenes yielded larger BOLD responses (mean β-value 0.32) than the poorly visible scenes (mean 0.24), suggestive of a stronger neural representation (t = 3.65, P < 0.002, Fig. 7). No association with later visibility ratings was observed for the early and late repetition phases (P > 0.1).
Time- versus Repetition-Dependent Effects
Finally, we investigated whether time-dependent factors (like changes in arousal or scanner noise) could underlie the modulations observed for repeated events. While a specific modulation attributed to representation sharpening should exclusively emerge for repeated images, unspecific time-dependent effects should be observed both for repeated and for novel images. To test these possibilities, we first compared the responses to the interspersed novel scenes from blocks 2 to 10 with those elicited by the scenes of the 1st block, in which all scenes were still novel. The respective t-test was not significant (t = 0.33, P = 0.75) as means were comparable (0.33 for novels in the first block [standard error, SE = 0.055], 0.31 for the interspersed novel events [SE = 0.046]). Next, we computed an ANOVA with the factors time (early, middle, late) and stimulus type (novel vs. repeated). We found an interaction time × stimulus type with F1.69,32.14 = 4.12, P = 0.031, ϵ = 0.846 but no main effects. Together, these analyses indicate that only responses to repeated items were modulated over time, whereas novel images, for which no memory trace had yet been formed, elicited the same large neural response irrespective of their position within an experimental run (first block vs. interspersed).
fMRI-Transverse Occipital Sulcus/Retrosplenial Cortex
On the group level, additional scene-specific areas could be identified from the localizer scan: area TOS (right: 31, −80, 24; left: −37, −87, 23) and RSC (right: 7, −56, 12; left: −10, −54, 13). With these clusters, additional ROI analyses were run in which the individual BOLD responses during the repetition experiment were assessed (see Fig. 4).
The ANOVA with the 4-level factor repetition (new, early, middle, late) was significant both for area TOS (F1.97,38.93 = 4.53, P = 0.02, ϵ = 0.68) and RSC (F2.01,37.31 = 4.78, P = 0.017, ϵ = 0.63). We then ran t-tests between novel versus repeated trials in order to assess novelty detection and between middle versus early and late phases in order to assess representation shaping, respectively. In area RSC, we only found an effect of novelty (t = 2.81, P = 0.04), whereby novel events induced a stronger response (mean 0.12) than repeated events (mean 0.08). In contrast, area TOS only showed an effect of repetition, whereby the middle phase showed stronger responses (mean 0.24) than the early (mean 0.19, t = 3.1, P = 0.03) and the late phases (mean 0.19, t = 3.0, P = 0.03), respectively.
By repeatedly presenting low-contrast scenes, we observed that the number of repetitions modulated BOLD responses in PPA: activation increased from the first to the fifth repetition and then began to decrease with further repetitions, displaying an inverted U-shape over the number of repetitions. A similar pattern of activation was also observed in TOS, another scene-specific brain region. In addition, we found that novel scenes elicited a stronger BOLD response in the PPA and RSC than repeated scenes. To anticipate our conclusions: we propose that these findings can only be explained if one assumes that activity in scene responsive areas (PPA, TOS, and RSC) is modulated by 2 different processes—representation shaping aimed at improving the quality of stimulus representations and novelty detection aimed at orienting attention to novel unexpected events.
The inverted U-shaped response from the first to the last repetition is in line with the hypothesis that repetitions of degraded low-visible stimuli are first utilized to improve the quality of the initially weak representation in higher sensory areas; only after this has been accomplished do further repetitions yield the usual attenuation of neural responses (Rainer et al. 2004; Turk-Browne et al. 2008).
Henson et al. (2000) have also reported increasing BOLD responses in the fusiform face area up to the fifth presentation of unfamiliar faces (for a similar finding in object-specific areas, see also Grill-Spector et al. 2000). As no further repetitions were used in that study, it can only be speculated whether the signal would have started to drop had even more repetitions been applied. Here, we show that from the sixth repetition, the BOLD signal in the PPA began to decline. We take this as evidence that subjects had become familiar with the initially unknown scenes, because in the study by Henson et al. (2000), repetition of familiar faces was also associated with an attenuation of the BOLD signal. In other words, familiarization and improvement of the internal representation may be parallel processes. The later signal drop presumably reflects that stimulus processing has become more efficient, for example, through sharpening or tuning of the neural response (Desimone 1996).
Area TOS, another scene-specific area, also showed the proposed inverted U-shaped activation pattern. This finding is in line with an earlier study that demonstrated that activity in TOS is modulated by familiarity (Epstein et al. 2007). Therefore, with respect to repetition-dependent representation shaping, TOS seems to be modulated similarly to PPA, supporting the idea of a local network for optimizing scene representations.
An alternative explanation for the observed inverted U-shaped activity pattern could be variations in arousal or any other time course dependent but otherwise unspecific factor. In order to assess this confound, we have performed analyses regarding the time-dependent modulations of neural responses to novel items. As no such effects were observed, unspecific confounds are an unlikely explanation for the observed repetition effects.
RTs became increasingly faster as the fMRI run progressed. The increasing RT speed during the fMRI experiment was paralleled by behavior during the postscanning experiment. Here, it could be shown that perception of repeated scenes was subjectively enriched and that scenes at the lower contrast levels were rated as more visible when they had previously been shown during the main experiment. Moreover, scenes that were rated as more visible were associated with larger BOLD responses during the middle phase of the preceding fMRI runs. The latter indicates that enhanced BOLD responses to degraded repeated stimuli index the formation of a stronger, more stable internal representation (i.e., better quality), allowing for better judgment when images become even more degraded (Kleinschmidt et al. 2002).
We also found a dissociation between the time course pattern of the physiological (inverted U) and the behavioral data (linear RT decline, no changes in accuracy), which argues against the idea that activity in scene-specific areas merely reflects behavioral performance (Grill-Spector et al. 2000). In that study, Grill-Spector et al. (2000) found that activity in object-specific areas continuously increased during perceptual training and was accompanied by improved identification of their masked objects. In the present study, a somewhat different pattern emerged, as we did not observe changes in discrimination performance but a steady decrease in RT as a function of repetition. A similar dissociation between RTs and fMRI suppression effects was reported by Sayres and Grill-Spector (2006). Note, however, that comparisons between these studies and ours have to be made with care, as repetitions might have different effects in scene as opposed to object-specific areas.
If representation shaping was the only process that drives the PPA's BOLD response, then one would have expected a larger response than to the first presentation as early as the second time a stimulus is presented, that is, with the first repetition. However, the signal in the first repetition block in our study was lower than for novel scenes. At first sight, this finding also contradicts the results of the study of Turk-Browne et al. (2007), who reported that a low-visible scene that is followed by a novel low-visible scene induces a weaker response than when the very same scene is repeated. In other words, in their study, the second presentation can be assumed to induce a larger signal than the first. In order to reconcile these apparent discrepancies, one needs to take a closer look at the specific differences in experimental designs, which can explain why the first presentation of a formerly novel scene constituted a special case in our study, so that the inverted U-shape was only observed from the “second” presentation (first repetition) onward.
Turk-Browne et al. (2007) compared stimulus pairs, whereas we used a continuous presentation paradigm where only 1 of 11 stimuli was novel whilst the majority of stimuli were repeated over and over again. Our design, therefore, resembles that of classical oddball tasks where infrequent novel and target stimuli are interspersed among frequent standard stimuli (e.g., Stoppel et al. 2009; Axmacher et al. 2010). Taken into the MR scanner, such oddball tasks routinely reveal an extended neuronal network which is activated by novel events and which involves frontal and limbic structures including the hippocampus (Yamaguchi 2004; Gur et al. 2007). The idea that repetition effects are strongly modulated by specific task parameters like the frequency with which a given stimulus is presented (and hence is expected) was also formulated by Summerfield et al. (2008). The authors could show that less expected stimuli induce stronger BOLD responses in a sensory brain region than highly expected frequent stimuli. Sayres and Grill-Spector (2006) also found that a repeated stimulus yields a stronger signal (less suppression) when many other stimuli intervene than when the repetition occurs shortly after the initial presentation and, therefore, is more expected.
A significant activity difference between novel and repeated scenes in the present study also emerged in the scene-specific area RSC, supporting the idea that this area participates in novelty detection. Interestingly, Epstein et al. (2007) also observed that novelty modulated RSC activity; they even reported that activity in no other scene-specific area was influenced by novelty as much as in RSC.
To conclude, the observation that beyond PPA 2 different brain areas (TOS vs. RSC) are involved in representation shaping and novelty detection, respectively, supports the idea that the 2 constitute separable processes that rely on different local circuitry, that is, interaction among different brain areas—although of course, the fact that 2 regions show similar activation patterns does not necessarily mean that they interact. Furthermore, it needs to be investigated how both processes are modulated by top-down control, as it has been shown that repetition effects are attention dependent (Eger et al. 2004). Future studies on effects of multiple repetitions will need to manipulate both bottom-up (e.g., visibility) and top-down (e.g., task difficulty) factors to shed more light on this issue.
German Research Foundation (DFG, Mu 1364/3 to N.G.M.), Federal Ministry of Education and Research (BMBF, to N.G.M.), and the Max Planck Society.
We would like to thank Lisa Koch and Sandra Anti for their help in data acquisition, Ralf Deichmann for his help with the fMRI sequences, and Caspar Schwiedrzik for helpful comments on earlier versions of this manuscript. Conflict of Interest: None declared.