Habituation is a fundamental form of learning manifested by a decrement of neuronal responses to repeated sensory stimulation. In addition, habituation is also known to occur on the behavioral level, manifested by reduced emotional reactions to repeatedly presented affective stimuli. It is, however, not clear which brain areas show a decline in activity during repeated sensory stimulation on the same time scale as reduced valence and arousal experience and whether these areas can be delineated from other brain areas with habituation effects on faster or slower time scales. These questions were addressed using functional magnetic resonance imaging acquired during repeated stimulation with piano melodies. The magnitude of functional responses in the laterobasal amygdala and in related cortical areas and that of valence and arousal ratings, given after each music presentation, declined in parallel over the experiment. In contrast to this long-term habituation (43 min), short-term decreases occurring within seconds were found in the primary auditory cortex. Sustained responses that remained throughout the whole investigated time period were detected in the ventrolateral prefrontal cortex extending to the dorsal part of the anterior insular cortex. These findings identify an amygdalocortical network that forms the potential basis of affective habituation in humans.
Habituation is a fundamental form of learning manifested by a stimulus-specific decrement of neuronal responses, that is, to repeated sensory stimulation with the same or a similar stimulus (Thompson and Spencer 1966; Dudai 2002). Habituation may serve the functional purpose of protecting the organism from flooding with irrelevant sensory information by allocating resources to new salient stimuli in the environment (Siddle 1991). Repetition-related reductions in neural activity have previously been reported at multiple time scales ranging from milliseconds (Weiland et al. 2008), seconds (Fischer et al. 2003), minutes (Whalen et al. 1998) to weeks (Johnstone et al. 2005). A great interest in habituation in the human brain in general and in the human amygdala—a key structure of affective processing—in particular lies in the potential importance of disturbed habituation processes for the pathophysiology of psychiatric disorders such as posttraumatic stress disorder (Protopopescu et al. 2005; Shin et al. 2005). Habituation especially to affective stimuli has also been extensively researched on the behavioral level in the framework of valence and arousal as 2 fundamental dimensions of emotional experience (Russell 1980): repeated exposure to pleasurable stimuli over a period of approximately 5 min caused them to be judged as less emotionally valenced (Leventhal et al. 2007) and subjective valence ratings habituated for pleasant but not for neutral sounds during repeated exposure close to 30 min (Martin-Soelch et al. 2006). Also addressing habituation of arousal, unpleasant pictures were found to be rated as less unpleasant, and both pleasant and unpleasant pictures as less arousing after repetitions within approximately half an hour (Codispoti et al. 2006). In summary, these studies indicate a robust effect of affective habituation that can be parameterized by decreased valence and arousal ratings to emotional stimuli with repetitions.
Both animal studies (Sawa and Delgado 1963; Herry et al. 2007) and functional neuroimaging studies in humans suggest that the amygdala plays an important role in habituation processes (Breiter et al. 1996; Whalen et al. 1998; Phillips et al. 2001; Wright et al. 2001; Herry et al. 2007) and in the coding of valence and arousal information (Anderson et al. 2003; Small et al. 2003; Winston et al. 2005). The amygdala is not a homogeneous structure but is composed of over 10 subnuclei (Amunts et al. 2005). Recently, using functional magnetic resonance imaging (fMRI) combined with a probabilistic atlas system (Amunts et al. 2005), it was shown that during auditory perception positive fMRI responses predominate in the probabilistically defined laterobasal amygdala subregion (Ball et al. 2007).
In contrast to previous studies, the present study aimed to examine the temporal dynamics of habituation in parallel on the behavioral and neuronal levels, in the amygdala and its probabilistically defined subnuclei as well as in related cortical areas. To this end, brain responses during processing of repeatedly presented, similar auditory stimuli were measured using blood oxygen level–dependent (BOLD) fMRI. Subjects gave ratings of their valence and arousal experience after each individual stimulus. This approach allowed us to delineate the network of brain areas showing a decline in activity on time scales similar to those of reduced valence and arousal experience. Furthermore, the present study aimed to differentiate this network from other brain regions showing habituation responses on different time scales (e.g., for the auditory cortex, see Seifritz et al. 2002).
To achieve these objectives, we analyzed brain responses to auditory stimuli in order to identify brain areas showing long-term decreases of the BOLD signal (habituation), increase of the BOLD signal (sensitization), and sustained responses, that is, responses showing neither habituation nor sensitization, all within the whole investigated period of time (43 min). Furthermore, short-term decreases within the duration of the individual stimuli (i.e., within 24 s) were analyzed.
As auditory stimuli, piano melodies in 4 variations of harmony and tempo (consonant-slow, consonant-fast, dissonant-slow, and dissonant-fast) were presented to healthy subjects without professional musical education during acquisition of BOLD fMRI. The variations of harmony and tempo were applied to induce different degrees of perceived valence and arousal. Based on the existent literature as summarized above, we hypothesized that, in particular, the amygdala might show reduced fMRI responses in parallel to habituation of valence and arousal ratings.
Materials and Methods
Twenty right-handed healthy volunteers participated in the study. One participant was excluded from data analysis due to excessive head movements during scanning, resulting in a final sample of 19 subjects (11 females, 8 males, mean age = 22.74 years, range = 19–34 years). All subjects received regular musical education at school, which comprised singing and the acquisition of elementary theoretical knowledge about music but no professional musical training. Handedness and lifetime music education were assessed before the fMRI experiment with questionnaires (Oldfield 1971; Litle and Zuckerman 1985); the findings are summarized in the section “Ratings and Questionnaire Data.” All participants had normal vision, no history of psychiatric or neurological diseases, and no hearing impairments. Only subjects free of any medication were included. The study was approved by the ethics committee of the University of Freiburg, Germany. Before participation, subjects signed written informed consent. The participants received a modest monetary compensation for participation.
For auditory stimulation, 40 piano pieces, each of 24-s duration, were presented during fMRI data acquisition. To create the 40 piano pieces, 10 melodies were selected. All melodies were taken from the major–minor tonal music, and the mean tempo was for all melodies 120 beats per minute, determined using the MIDI toolbox by Toiviainen and Eerola (the MIDI toolbox can be freely accessed through: https://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/miditoolbox/).
The tempo and the harmony of the 10 selected piano melodies (see Table 1 in the Supplementary Material) were varied, creating 4 versions of each tune: consonant-fast, consonant-slow, dissonant-fast, and dissonant-slow. Tempi for the fast and slow versions were 156 and 84 beats per minute, respectively. These values were selected because they represented the fastest and slowest tempi, respectively, that still sounded natural to one professional musician and 5 nonmusicians. The consonant stimuli were the original tunes, whereas the dissonant stimuli were electronically manipulated counterparts of the original tunes, created by shifting the melody of the original excerpt, but not the accompanying chords, a half tone below the original pitch. Thus, the dissonant and consonant and the fast and slow versions of a tune had the same rhythmic structure. All stimuli were processed using Cubase VST/32 R.5 (Steinberg Media Technologies GmbH, Hamburg, Germany). The variations of harmony and tempo were applied to induce different degrees of perceived valence and arousal. Finally, all sound files were transformed into wave files for stimulation in the scanner using WaveLab 4.0 (Steinberg Media Technologies GmbH, Hamburg, Germany). Importantly, the average sound pressure level of all music pieces was equalized using Steinberg WaveLab 4.0.
Moreover, to investigate the network of brain areas underlying habituation during the perception of auditory stimuli, we aimed to select piano melodies unknown to the subjects in order to avoid confounding differences in the level of familiarity. We assessed familiarity by subjects’ ratings after scanning (ranging from 0 = not familiar to 3 = familiar). The mean value for familiarity ratings of the pleasant stimuli was 0.79 (±0.86), and the corresponding value for the unpleasant melodies was 0.74 (±0.81, see also “Results section”), showing that the selected musical pieces were relatively unfamiliar to the subjects.
Experimental Setup and Procedure
Subjects were asked to complete the state anxiety inventory (STAI-S; Laux et al. 1981) in the scanner using a magnetic resonance (MR) imaging-compatible mouse. This questionnaire was applied to evaluate the subjects’ anxiety level. The mouse allowed subjects to move a white box to the left or right along a visually presented scale by pressing the corresponding mouse buttons with the right hand. During this procedure, the same echo planar imaging (EPI) sequences were run as later during the fMRI experiment to generate the same level of scanner noise, as scanner noise is one factor likely to contribute to subjects’ anxiety in the scanner.
Afterward, the actual fMRI experiment was conducted in a “block design.” The 40 piano melodies were presented in a random order, using in-house developed presentation software, via MR-compatible headphones (NordicNeuroLab, Bergen, Norway). Subjects viewed a fixation cross during the experiment and were instructed to listen attentively to the music and to avoid any overt movement. Each melody was preceded by a written instruction presented on the screen (“music starts”), and each melody lasted 24 s. An evaluation period followed each melody presentation. Within this period, participants were asked to rate the preceding melody on a 7-point self-assessment bipolar scale along the dimensions valence (ranging from −3 = very unpleasant to 3 = very pleasant) and arousal (ranging from −3 = very calming to 3 = very arousing). Each of the 2 rating periods lasted 6 s. Subjects communicated their decisions by using a scanner-compatible mouse that allowed them to move a white box on the visually presented scale leftwards or rightwards by pressing the corresponding mouse button with their right hand. The evaluation was followed by a resting period of 11-s duration. The experiment had 40 runs (each consisting of melody presentation, evaluation, and rest), and the total scanning time for the experiment was 43 min. The experimental procedure is summarized in Figure 1 in the Supplementary Material.
Functional and structural images were acquired on a 3-T scanner (Siemens Magnetom Trio, Erlangen, Germany). Image acquisition started with localizing the brain, a reference scan for the distortion correction, and the anatomical scans of the brain that were obtained using a magnetization-prepared rapid-acquired gradient echoes (MPRAGE) sequence of 7-min duration. Subsequently, the fMRI experiment was conducted. Structural T1-weighted images were obtained using a MPRAGE sequence (resolution: 1-mm isotropic, matrix: 256 × 256 × 160, time repetition [TR]: 2200 ms, time to inversion: 1000 ms, 12° flip angle). Functional images were obtained using a multislice gradient EPI method. Each volume consisted of 44 sagittal slices (resolution: 3-mm slice thickness, matrix: 64 × 64, field of view: 192 × 192 mm, volume TR: 3000 ms, time echo: 30 ms, 90° flip angle). The sagittal slice orientation resulted in significantly lower acoustic noise generated by the imaging gradients, enabling a better auditory perception. In addition, this orientation in combination with the slice thickness of 3 mm reduced the signal loss in the amygdala region to give more reliable detection of responses. An accurate registration of the functional and structural images was enabled by correction of the functional image data for geometric distortions (Zaitsev et al. 2004). The distortion field was derived from the local point spread function in each voxel as determined in a 1-min reference scan. Prior to distortion correction, the data were motion corrected by image realignment with the reference scan. A representative example of EPI data after distortion correction is shown in Ball et al. (2007) demonstrating good EPI signal quality in the amygdala region.
Preprocessing and Statistical fMRI Analysis
Motion and distortion correction were performed online during the reconstruction process (see above). Preprocessing consisted of normalization and smoothing. All functional images were normalized into standard stereotaxic space of the Montreal Neurological Institute (MNI) template. Subsequently, the images were smoothed using a 6-mm full width at half maximum Gaussian kernel to minimize the effects of individual variations in anatomy and to improve the signal-to-noise ratio. The piano melodies and the evaluation periods were each modeled with a boxcar function convolved with a canonical hemodynamic response function. A high-pass filter with a cutoff of 1/128 Hz was applied before parameter estimation. We performed preprocessing and data analysis using SPM5 (Wellcome Department of Cognitive Neurology, London, UK). On the single-subject level, contrast images of music perception > baseline (i.e., time periods during which subjects passively viewed the fixation cross without stimulus presentation; for further details regarding the experiment, see also Fig. 1 in the Supplementary Material) were calculated for all subjects. Fast habituation was analyzed on the single-subject level by comparing the first half with the second half of the music presentation (i.e., the first 12 s with the second 12 s). Long-term habituation and sensitization were analyzed using a regressor modeling linear changes in music-related response amplitudes over the whole time of the experiment (43 min). Moreover, differential habituation patterns in response to the 4 music conditions (consonant-slow, consonant-fast, dissonant-slow, and dissonant-fast) were analyzed on the single-subject level by comparing fast with slow piano melodies (collapsing consonant and dissonant melodies) and consonant with dissonant melodies (collapsing fast and slow melodies).
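The regressors described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPM5 implementation: the double-gamma kernel is a common approximation of a canonical hemodynamic response function, the regressors are sampled coarsely at the volume TR, and the trial onsets in the usage example are hypothetical because the text does not list them. The function names are invented for this sketch.

```python
from math import exp, gamma
from typing import List, Sequence

TR = 3.0              # volume repetition time in seconds (from the text)
STIM_DURATION = 24.0  # melody duration in seconds (from the text)


def double_gamma_hrf(t: float) -> float:
    """Double-gamma approximation of a canonical HRF (an assumption of this sketch)."""
    if t <= 0.0:
        return 0.0
    peak = t ** 5 * exp(-t) / gamma(6)          # positive response
    undershoot = t ** 15 * exp(-t) / gamma(16)  # late undershoot
    return peak - undershoot / 6.0


def boxcar(onsets: Sequence[float], weights: Sequence[float],
           n_scans: int, duration: float = STIM_DURATION) -> List[float]:
    """Weighted boxcar sampled at scan times; one weight per trial."""
    series = [0.0] * n_scans
    for onset, weight in zip(onsets, weights):
        for scan in range(n_scans):
            if onset <= scan * TR < onset + duration:
                series[scan] += weight
    return series


def convolve_hrf(series: Sequence[float], hrf_length: float = 30.0) -> List[float]:
    """Causal discrete convolution of a regressor with the HRF kernel."""
    kernel = [double_gamma_hrf(k * TR) for k in range(int(hrf_length / TR) + 1)]
    return [
        sum(kernel[k] * series[n - k] for k in range(len(kernel)) if n - k >= 0)
        for n in range(len(series))
    ]


def linear_modulator(n_trials: int) -> List[float]:
    """Mean-centered linear trend over trials, used to weight the boxcar."""
    mean = (n_trials - 1) / 2.0
    return [i - mean for i in range(n_trials)]


# Hypothetical onsets: 4 trials, one every 60 s, 80 scans in total.
onsets = [0.0, 60.0, 120.0, 180.0]
main_effect = convolve_hrf(boxcar(onsets, [1.0] * 4, 80))
linear_trend = convolve_hrf(boxcar(onsets, linear_modulator(4), 80))
```

In this sketch, a negative weight on `linear_trend` would correspond to long-term habituation and a positive weight to sensitization. The first-half versus second-half comparison could be built the same way by calling `boxcar` with `duration=12.0`, once at the melody onsets and once at the onsets shifted by 12 s.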
Subsequently, group-level statistics were computed to reveal significant responses over the total group of the 19 subjects in the contrast music perception > baseline. The results are reported for the t-test at P < 0.05, false discovery rate (FDR) corrected, cluster size > 10 voxels. For anatomical assignments, we used probabilistic anatomical maps (Toga et al. 2006). The results of the fast and long-term habituation were analyzed using a region of interest (ROI) model defined by the amygdala taken from the probabilistic anatomical map to the human amygdala (Amunts et al. 2005) and the activated regions in the contrast music perception > baseline. The results of the habituation analyses are reported at P < 0.005 level (t-test, uncorrected, cluster size > 50 voxels). To test for differences in the speed of long-term habituation, in particular, in the right versus left amygdala, correlation coefficients of BOLD response amplitude with time were determined at both the right and left amygdala long-term habituation peaks (for further details, see Results Section) for each subject, and correlation coefficients from the right versus left amygdala were then compared using a sign test. In addition, sensitization was analyzed by assessing the increase in the BOLD signal over the whole time of the experiment at P < 0.05, FDR-corrected, cluster size > 10 voxels for whole-brain analysis and for the ROI (amygdala + regions activated in the music > baseline contrast) at P < 0.005 (t-test, uncorrected, cluster size > 50 voxels).
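The right-versus-left comparison of per-subject correlation coefficients can be illustrated with an exact sign test. The sketch below assumes a two-sided test with ties dropped, which the text does not specify, and the function name is hypothetical.

```python
from math import comb
from typing import Sequence


def sign_test_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided exact sign test for paired samples; tied pairs are dropped."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)
    # P(X <= k) under Binomial(n, 0.5), doubled for the two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Here `a` and `b` would hold one correlation coefficient per subject (BOLD response amplitude against time) for the right and left amygdala peaks, respectively. The test uses only the sign of each within-subject difference, so it makes no assumption about the distribution of the coefficients.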
Furthermore, to assess brain areas showing sustained responses, that is, showing neither long-term (over the experiment’s 43-min duration) nor short-term habituation effects, we masked the results of the contrast music perception > baseline with regions showing no significant effect in either the short-term or the long-term habituation contrast at P > 0.3. Results are thus reported at the P > 0.3 level (t-test, uncorrected, cluster size > 50 voxels). Finally, to analyze the impact of subjects’ anxiety level and music education on BOLD effects, the scores of the questionnaires STAI-S and music education were correlated with the contrast music versus baseline, habituation (short- and long-term), sensitization, and sustained responses.
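The masking logic for sustained responses can be written as a simple voxelwise rule. This is a minimal sketch with hypothetical inputs (flat lists with one entry per voxel); it leaves out the cluster-size criterion and is not the SPM5 masking routine.

```python
from typing import List, Sequence


def sustained_mask(activated: Sequence[bool],
                   p_short: Sequence[float],
                   p_long: Sequence[float],
                   p_floor: float = 0.3) -> List[bool]:
    """Keep voxels active in music > baseline whose P values exceed
    p_floor in both the short-term and the long-term habituation contrast."""
    return [
        act and ps > p_floor and pl > p_floor
        for act, ps, pl in zip(activated, p_short, p_long)
    ]
```

For example, a voxel that is active in music perception > baseline with habituation P values of 0.6 (short-term) and 0.45 (long-term) is retained, whereas a voxel with P = 0.02 in either habituation contrast is excluded.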
Analysis of Ratings and Questionnaire Data
Normal distribution of the questionnaire data (STAI-S and maximum lifetime music education) and subjects’ ratings (the valence and arousal ratings which were given by the subjects in the scanner after each stimulus presentation and the global familiarity ratings after scanning) were tested with a Kolmogorov–Smirnov test and analyzed using SPSS (Version 11.0, Mac OS X Version) and Matlab (Version 7.0.4, MathWorks, Natick, MA). Ratings of valence and arousal were analyzed for habituation effects by calculating the correlation coefficients between the time course of the experiment and the magnitude of both valence and arousal ratings. For this analysis, the scores of the 2 bipolar dimensions valence (ranging from −3 = very unpleasant to 3 = very pleasant) and arousal (ranging from −3 = very calming to 3 = very arousing) were transformed into absolute scores (ranging from 0 to 3, with 0 corresponding to neutral), and subsequently, correlations of these values with time were tested using Spearman’s rank correlation coefficient.
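The rating analysis described above, taking absolute scores and then rank-correlating them with time, can be sketched as follows. The implementation uses average ranks for ties and computes Spearman's coefficient as the Pearson correlation of the ranks. The function names and the ratings in the usage example are hypothetical and are not data from the study.

```python
from typing import List, Sequence


def rating_magnitudes(ratings: Sequence[float]) -> List[float]:
    """Map bipolar ratings (-3..3) to absolute scores (0..3, 0 = neutral)."""
    return [abs(r) for r in ratings]


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical valence ratings for 8 consecutive trials.
ratings = [3, -3, 2, -2, 2, -1, 1, 0]
trial_order = list(range(1, len(ratings) + 1))
rho = spearman_rho(trial_order, rating_magnitudes(ratings))
```

Taking absolute values first means that a pleasant melody drifting from 3 toward 0 and an unpleasant melody drifting from −3 toward 0 both count as a decline, so a negative `rho` indicates that ratings moved toward neutral over the experiment.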
Ratings and Questionnaire Data
The ratings of valence, arousal, and the global familiarity rating as well as the questionnaire data (STAI-S and music education) were Gaussian distributed (Kolmogorov–Smirnov test, P < 0.05). Total scores for the STAI-S of the 19 subjects of the fMRI study ranged from 29 to 61, with a mean of 38.05 and a standard deviation of ± 8.5. Maximum lifetime music education had a mean of 7.97 ± 4.65 years. All subjects were right-handed according to the Edinburgh handedness questionnaire (Oldfield 1971): mean = 84.95%, range = 75–100%. After scanning, subjects were asked to rate globally the familiarity (ranging from 0 = not familiar to 3 = familiar) of the pleasant and of the unpleasant melodies. Familiarity ratings for the pleasant and unpleasant melodies were identical in all cases. The mean familiarity rating was 0.74 (± 0.81), indicating that the musical pieces were relatively unfamiliar to the subjects. The magnitudes of both valence and arousal ratings showed a significant negative correlation with time over the course of the experiment (P = 0.0086 for valence and P = 0.0053 for arousal ratings). The time course of the magnitude of valence and arousal ratings is illustrated in Figure 1, showing a clear decline of magnitude in both cases. Figure 1 shows that valence ratings declined within the first 20 min, whereas arousal ratings decreased over the whole time of the experiment (43 min).
Functional Imaging Data
Figure 2 shows the brain areas activated in the 19 right-handed subjects during the perception of piano melodies. Significant responses of voxels at P < 0.05 (FDR-corrected, cluster size > 10 voxels) for the contrast music perception > baseline were located in the right and left primary and secondary auditory cortex. Responses were also found in the insular cortex (left) and in the region of the temporal poles (bilateral). In addition, increased BOLD signal was found in the inferior frontal cortex including Broca’s area and the right hemisphere homologue to Broca's area (area 44/45) extending to the right and left ventrolateral prefrontal cortex, the supplementary motor area (SMA), and in the cerebellum (bilateral). Responses were also found in the right and left amygdala, in the right hippocampus, and in the left caudate nucleus. Brain areas with significant responses are summarized in Supplementary Table 2.
Significant long-term habituation occurring within 43 min, that is, the whole duration of the experiment, was found both in the right and left amygdala. The habituation peaks were located in the probabilistically defined laterobasal amygdala subregion (Fig. 3b–d). Habituation was more rapid in the right than in the left laterobasal amygdala (sign test, P < 0.05). Moreover, long-term habituation of brain responses was also found in the right and left superior temporal gyrus extending to the primary auditory cortex, in the right inferior frontal cortex (including Broca’s homologue/area 44), and in the left hippocampus region (see Fig. 3a). The results of the long-term habituation are summarized in Tables 3 and 4 in the Supplementary Material. No differential long-term habituation was found when comparing fast with slow piano melodies (collapsing consonant and dissonant melodies) or when comparing consonant with dissonant melodies (collapsing fast and slow melodies).
Significant short-term habituation occurring within 24 s, that is, the duration of the individual piano melodies, was found in the probabilistically defined primary auditory cortex extending to the superior temporal gyrus (Rademacher et al. 2001; see Fig. 3e). Peak MNI coordinates and anatomical assignment using the probabilistic anatomical map of the primary auditory cortex are summarized in Table 5 in the Supplementary Material. There was no significant sensitization, that is, no BOLD response increase, over the time course of the experiment in any brain region.
Sustained brain responses during the whole time of the experiment, that is, responses showing neither significant fast nor long-term habituation in response to piano music perception, were found in the left ventrolateral prefrontal cortex extending into the anterior insula and Broca’s area (area 44/45) and in its homologue on the right hemisphere (area 45) at P > 0.3 level (t-test, uncorrected, cluster size > 50 voxels; see Fig. 3f and Table 6 in the Supplementary Material).
Finally, we found positive correlations with the STAI-S and the individual contrast music perception versus baseline in the left parietal lobule and left area 2 (P < 0.05, FDR-corrected, cluster size > 10 voxels, Table 7 in the Supplementary Material) and positive correlations of the lifetime music education scores and the individual short-term habituation in the right superior temporal gyrus (P < 0.005, uncorrected, cluster size > 50 voxels, masked with the contrast music perception > baseline).
In the present study, we have investigated the neural basis of affect dynamics in the human brain using an integrated approach, tracing temporal changes both in brain responses using BOLD-sensitive fMRI and in affective experience using immediate rating of each presented stimulus within the fMRI scanner. We find a decline in the stimulus-induced fMRI BOLD signal with repeated auditory stimulation in the probabilistically defined laterobasal amygdala occurring on a time scale of minutes (i.e., within the total 43 min of the investigated period of time). This slow decline was paralleled by a decrease of the amplitude of valence and arousal ratings of the music pieces. Furthermore, the nonprimary auditory cortex (the superior temporal gyrus), the right inferior frontal gyrus including the right hemisphere homologue of Broca’s area (area 44), and the hippocampus region demonstrated habituation on the same, slow time scale. In contrast, short-term decreases occurring within seconds predominated in the primary auditory cortex and extended to the superior temporal gyrus. Sustained responses throughout the whole investigated period of time were detected in the ventrolateral prefrontal cortex, in the dorsal part of the anterior insular cortex, and in Broca’s area (area 44/45) and its homologue on the right hemisphere (area 45). Together, these findings demonstrate that different time scales of habituation during auditory perception coexist in the human brain and indicate an amygdalocortical network underlying affective habituation to music. The integrated, neuroimaging-behavioral approach of the present study may in the future be valuable not only to study affect dynamics in healthy subjects but also its possible disturbances in neuropsychiatric disorders.
Consistent with previous neuroimaging studies on music perception in individuals without professional music education, the current investigation identified music-related responses in areas including the auditory cortices, the insular cortex, the left and right inferior frontal cortex including Broca’s area and its right hemisphere homologue (area 44/45), the ventrolateral prefrontal cortex, the amygdala, and the hippocampus. In addition, the cerebellum, the SMA, and the caudate nucleus were also found to be activated (Blood and Zatorre 2001; Koelsch et al. 2006; Zatorre et al. 2007). The amygdala has repeatedly been shown to be responsive to auditory stimuli such as human vocalizations (Sander and Scheich 2001, 2005; Seifritz et al. 2003) and music (Blood and Zatorre 2001; Koelsch et al. 2006; Ball et al. 2007; Mutschler et al. 2008). The cerebellum, the SMA, and the caudate nucleus have been implicated in musical rhythm perception (Chen et al. 2008). Alternatively, the SMA activity observed in the present study might also be related to suppression of movement execution (Rizzolatti et al. 1990; Ball et al. 1999) because listening to music might induce the desire to move (e.g., to tap the rhythm), which subjects had to inhibit during the experiment because they received the explicit instruction to avoid any movement.
Further, Broca’s area and its right hemisphere homologue (area 44 and 45) have been reported as being involved in music perception (Tillmann et al. 2003; Koelsch et al. 2005). Activation in Broca’s area during music perception has been related to music-syntactic processing (Koelsch 2005). The music-related responses that we find in the ventrolateral prefrontal cortex are consistent with animal studies showing that neurons in the ventrolateral prefrontal cortex are selective for complex sounds (Romanski and Goldman-Rakic 2002; Romanski 2007). However, it is still under debate to what extent the monkey ventrolateral prefrontal cortex is homologous to the human ventrolateral prefrontal cortex or rather to Broca’s area (Petrides and Pandya 2002). To our knowledge, the present study is the first to find music perception-related responses in the human ventrolateral prefrontal cortex, specifically in a region anterior to Broca’s area located on the lateral convexity of the frontal lobe.
The responses we observed in the left anterior insular cortex are in line with a recent meta-analysis of brain imaging studies showing that the dorsal anterior part of the anterior insula is reproducibly involved in auditory processing such as in the perception of vocalizations and music (Mutschler et al. 2009). After having delineated this highly plausible network of music-related brain areas (see Fig. 2), we have investigated the dynamics of the BOLD responses in these areas on different scales of time.
We find significant long-term habituation effects bilaterally localized in the amygdala, in the nonprimary auditory cortex extending to primary auditory cortex, in the right hemisphere Broca’s homologue (area 44), and in the hippocampus region (see Fig. 3a), suggesting that these brain areas might constitute a functional network. This hypothesis is supported by anatomical studies showing that both the hippocampus and the nonprimary auditory cortex are anatomically connected to the amygdala (McDonald 1998) and that there are temporal–frontal projections to Broca’s homologue (Glasser and Rilling 2008). According to a widely applied model of amygdala function, a short latency thalamic pathway supplies the amygdala with a crudely analyzed version of sensory inputs allowing for a fast response, while a long latency cortical pathway provides an elaborately processed version of the incoming information to the amygdala (LeDoux 2000). The analysis of the auditory input taking place in Broca’s area, which has been described as being related to music syntax processing (Tillmann et al. 2003), together with the superior temporal region, which is also assumed to be involved in the processing of complex sounds (Rauschecker and Scott 2009), likely belongs to the cortical pathway feeding highly processed auditory information to the amygdala complex.
Amygdala habituation in response to complex auditory stimuli might reflect the fact that the sensory stimuli are no longer of relevance for the individuals because the amygdala is thought to be essential in evaluating stimulus salience and in initiating behavioral responses (Sander et al. 2003). This assumption is supported by our data: valence and arousal ratings induced by the musical pieces decreased during the experiment (i.e., they tended towards neutral). In particular, arousal ratings decreased over the whole time of the experiment (43 min) and valence ratings declined within the first 20 min. Our behavioral findings support previous results showing that affective reactions decline over time in response to repeated sensory exposure (Dijksterhuis and Smith 2002). Moreover, our findings indicate a neural basis for these habituation effects of emotional ratings: We show that ratings of valence and arousal declined on a similar time scale as BOLD responses of the amygdala, which is thought to encode an integrated representation of valence and arousal of sensory stimuli (Winston et al. 2005). In the present study, long-term habituation patterns were modeled by a linear decrease in BOLD response amplitude. The choice of this linear approach is supported by the findings of Leventhal et al. (2007), specifically demonstrating a linear decrease in affect responses to repeated exposure of pleasurable stimuli.
The peaks of long-term habituation effects were bilaterally localized in the probabilistically defined laterobasal amygdala. This result is in line with previous animal research: in mice, habituation of single-cell activity in response to sounds was found within the laterobasal amygdala (Herry et al. 2007). Furthermore, studies in various animal species consistently show that the majority of sensory, including auditory, afferents project to the laterobasal amygdala subregion (Bordi and LeDoux 1992; McDonald 1998, 2003). The more rapid habituation rate for the right in comparison to the left amygdala (Phillips et al. 2001; Wright et al. 2001) might also explain why, across studies, the left amygdala is more often found to be activated than the right amygdala (Baas et al. 2004; Ball et al. 2009). The finding that the left amygdala demonstrates slower habituation across various sensory modalities is in agreement with the suggestion by Hardee et al. (2008) that the left amygdala is involved in stimulus processing in a more elaborative way than the right amygdala and might therefore remain involved for longer.
Parallel to the network demonstrating habituation on a time scale of minutes, there were no significant increases (sensitization) in music-related responses on the same time scale detectable in the present study. Two different brain mechanisms mediating affective habituation have been previously proposed: The first mechanism assumes that the emotional network including the amygdala may be extrinsically suppressed by another brain system showing gradually increasing activity (Feinstein et al. 2002). Alternatively, the second mechanism proposes that the emotional circuits including the amygdala show intrinsic habituation without being extrinsically suppressed by another brain system showing increased activation (Hatta et al. 2006). The present results are in favor of the second hypothesis, that is, they suggest that emotional habituation is not dependent on increased responses in more cognitive brain regions suppressing the emotional system. However, particular care has to be taken in the interpretation of negative findings. The fact that we do not find any sign of neuronal sensitization may be due to the limitations of the fMRI method (Logothetis 2008), or the detection of (weak) sensitization effects might require a larger sample of subjects. Therefore, further studies using fMRI as well as other methods, for example electrophysiological ones, will be required for evaluating the 2 proposed mechanisms of affective habituation.
In contrast to the long-term decreases observed in the nonprimary auditory cortex, we find short-term decreases of the BOLD signal in response to music in the probabilistically defined primary auditory cortex extending to the superior temporal gyrus (see Fig. 3e). Neural habituation effects within seconds have been obtained for pure tones in the primary auditory cortex of anesthetized cats (Ulanovsky et al. 2003, 2004) and awake monkeys (Micheyl et al. 2005). In terms of the underlying mechanisms, it is unlikely that the observed short-term decreases in the primary auditory cortex can be fully explained by sensory adaptation, which is defined as the decrease in a sensory response over time in the presence of a constant stimulus (Eatock 2000). In particular, the sensory adaptation exhibited by hair cells in the inner ear occurs within milliseconds, a much shorter time scale than that of the decreases during music presentation reported here, which occurred within seconds (Eatock 2000). Interestingly, subjects with a higher lifetime music education score demonstrated faster short-term habituation in the superior temporal gyrus. Musical training has been shown to induce functional and anatomical changes in the human brain (Munte et al. 2002; Fujioka et al. 2006). The present findings indicate that changes in auditory habituation, in particular in fast habituation processes in primary auditory and surrounding areas, are part of the changes induced by long-term musical training. These habituation changes in subjects with more music education may have a functional significance for the musical skills that are acquired through musical training.
Studies in monkeys and humans indicate a hierarchical organization of the cortical auditory system: the primary auditory cortex—or auditory core area—represents the first cortical stage of sound processing. The secondary or belt areas that surround the primary auditory cortex are thought to be especially involved in processing of complex sounds with more temporal structure and a broader frequency spectrum than simple single-frequency tones (Rauschecker and Scott 2009). The belt areas and the adjacent regions of the superior temporal gyrus demonstrate stronger fMRI responses to band-pass noise compared with pure tones (Seifritz et al. 2006) and have been implicated in the processing of complex sounds such as speech (Fecteau et al. 2005) and animal vocalizations (Altmann et al. 2007).
In the present study, we investigated responses to a highly complex and varied auditory stimulus, that is, excerpts of classical piano music with different variations of harmony and tempo. Previously, Seifritz et al. (2002) investigated the spatial and temporal patterns of neural processing in the human auditory cortex using less complex sounds, finding that the contribution of a sustained response component became less predominant as one moves from the primary auditory cortex, or core area, to the surrounding belt area of the auditory cortex, while the opposite was found for a transient response component at stimulus onset. Together with our present study, these findings indicate that the temporal patterns of habituation processes in the human brain substantially depend on stimulus complexity. The predominance of a transient response to simple tones in the auditory belt area (Seifritz et al. 2002) and of a slowly habituating response to complex sounds in the belt and adjacent areas (present study) might thus reflect a preference for complex sound processing. In summary, the present findings are in agreement with the idea of hierarchical auditory processing in the cortical auditory system (Rauschecker and Scott 2009).
In the present study, we also find sustained responses, that is, responses with neither short- nor long-term habituation nor sensitization throughout the whole investigated period of time (43 min). Such sustained responses were found in Broca’s area (area 44/45) and its homologue in the right hemisphere (area 45), in the left ventrolateral prefrontal cortex, and in the left anterior insula (see Fig. 3f). Because subjects had to rate each melody after presentation throughout the whole experiment, the sustained responses in the insula might reflect task-set processing (Dosenbach et al. 2006) related to the rating task. The ventrolateral prefrontal responses might be explained by the working memory demands of the rating task (Arnott et al. 2005), which remained constant throughout the experiment. Interestingly, a recent study in macaque monkeys showed that neurons in the prefrontal cortex demonstrated only weak size-contingent repetition effects (Verhoef et al. 2008).
In summary, the present study delineates a neural basis for the habituation effects of emotional experience in humans. We show that ratings of valence and arousal declined on a similar, slow time scale as habituation of responses of the laterobasal amygdala. Our results demonstrate a temporal specificity of BOLD fMRI to disentangle habituation processes on different scales of time. Electrophysiologically, however, habituation effects have been shown after even shorter stimulus repetition times (Fischer et al. 2003; Weiland et al. 2008). The results of the present study demonstrate that different time scales of habituation coexist during the perception of music in an amygdalocortical network that might support a hierarchical model of complex auditory processing. From a psychological perspective, it has been assumed that the mechanisms underlying habituation have a functional purpose by protecting the organism from flooding with irrelevant sensory information by allocating resources to new salient stimuli in the environment (Siddle 1991). It has been hypothesized that it is maladaptive not to habituate (Dijksterhuis and Smith 2002) as, for example, in patients with anxiety disorders (Protopopescu et al. 2005; Shin et al. 2005). Future studies could therefore utilize stimuli with varying complexity/biological relevance in order to investigate adaptive and disturbed habituation processes as they have been proposed to be critical in psychiatric disorders. Employing such habituation paradigms in an integrated neuroimaging-behavioral fashion as in the present study may be a valuable approach for such future investigations of the neuronal mechanisms of affect dynamics in the human brain.
Swiss National Science Foundation (grant 51A240-104890); and the VolkswagenStiftung (grant I/83 078) within the European Platform.
The authors thank Manuela Keckeis from the MR Physics, Freiburg for support during the fMRI experiments, Prof. Wilfried Gruhn from the University of Music in Freiburg and Christoph Kaller from the Department of Neurology, Freiburg for help regarding music selection. Furthermore, we thank the “YAMAHA Stiftung 100 Jahre e.V.” for providing the MR-compatible headphones.
Conflict of Interest: None declared.