Abstract

Music constitutes an ideal means to create a sense of suspense in films. However, there has been minimal investigation into the underlying cerebral organization for perceiving danger created by music. In comparison, the amygdala's role in recognition of fear in non-musical contexts has been well established. The present study sought to fill this gap in exploring how patients with amygdala resection recognize emotional expression in music. To this aim, we tested 16 patients with left (LTR; n = 8) or right (RTR; n = 8) medial temporal resection (including amygdala) for the relief of medically intractable seizures and 16 matched controls in an emotion recognition task involving instrumental music. The musical selections were purposely created to induce fear, peacefulness, happiness and sadness. Participants were asked to rate to what extent each musical passage expressed these four emotions on 10-point scales. In order to check for the presence of a perceptual problem, the same musical selections were presented to the participants in an error detection task. None of the patients was found to perform below controls in the perceptual task. In contrast, both LTR and RTR patients were found to be impaired in the recognition of scary music. Recognition of happy and sad music was normal. These findings suggest that the anteromedial temporal lobe (including the amygdala) plays a role in the recognition of danger in a musical context.

Introduction

Music is very effective in inducing fear. Imagine viewing a film such as Jaws or Psycho with no musical soundtrack: the experience would be much more subdued. Indeed, music has been shown to greatly amplify and generate feelings of suspense in the context of films (Cohen, 2001). Yet music has not been considered in the neuropsychological study of fear perception, although its associated neural correlates (such as the amygdala) have been documented in non-musical contexts, particularly for facial expressions (Adolphs et al., 1994, 1995; Young et al., 1995, 1996; Breiter et al., 1996, Experiment 1; Calder et al., 1996; Morris et al., 1996, 1998; Phillips et al., 1997; Broks et al., 1998; Whalen et al., 1998; Adolphs et al., 1999b, 2001). Hence, the goal of the present study was to examine the recognition of threat as expressed by music in patients with damage to the amygdala.

The amygdala has been identified as one of the most important brain structures of the limbic system that is implicated in fear perception. Bilateral amygdala damage impairs the recognition of unpleasant emotions, especially fear, in facial expressions (Adolphs et al., 1994, 1995; Young et al., 1995, 1996; Calder et al., 1996; Broks et al., 1998). Such patients can typically perceive happiness but not fear in facial expressions. Additional support for the involvement of the amygdala in perceiving fear in facial expressions comes from recent functional imaging studies of the normal brain. Increased neural activity has been observed within the amygdala when subjects viewed facial expressions of fear as opposed to happiness (Breiter et al., 1996, Experiment 1; Morris et al., 1996, 1998) and to disgust (Phillips et al., 1997).

The involvement of the amygdala in fear-related situations does not appear to be limited to facial expressions. The amygdala also plays a key role in conditioned fear in both humans and rats (e.g. Bechara et al., 1995; LaBar et al., 1995; LeDoux, 1996; LaBar et al., 1998) as well as in fear-related behaviours, such as uncontrollable rage reactions (Davis, 1992). Moreover, direct stimulation of the human amygdala with in-depth electrodes evokes similar reactions (Halgren et al., 1978), providing further support for the idea that the amygdala is critical in the processing of emotions related to threat and danger.

Nevertheless, it is worth noting that prior attempts to find evidence for the involvement of the amygdala in the emotional evaluation of auditory stimuli have been equivocal. Some researchers have found positive evidence for the involvement of the amygdala in the auditory recognition of emotions in vocal sounds such as screams and yells (Scott et al., 1997) and speech prosody (Scott et al., 1997; Morris et al., 1999; Phillips et al., 1998), while others have failed to do so (Anderson and Phelps, 1998; Adolphs and Tranel, 1999; Royet et al., 2000; Adolphs et al., 2001). In the latter studies, all of which employed a patient-based approach, bilateral damage to the amygdala yielded an impairment in the recognition of fear in faces, while it spared the recognition of fearful voices. Similarly, listening to unpleasant musical events, as created by dissonance, does not activate the amygdala (Blood et al., 1999). This lack of convergence in recruiting the amygdala across auditory communication channels is intriguing. It suggests that either the amygdala's role is limited to facial expressions, or that vocal and musical expressions of emotions are not very effective in signalling threat. The present study aimed to shed some light on this issue by exploring yet another way to induce fear, namely through the composer's intention to evoke a sense of suspense with musical sounds.

With this type of musical material, we tested 16 patients after unilateral medial temporal lobe excision for the relief of intractable epilepsy. These excisions typically remove the amygdala and surrounding neural tissue. In addition to the amygdala, closely associated structures in the anterior and medial temporal lobe are likely to play a role in mediating the elicitation of emotional responses, as suggested in the classic study of Klüver and Bucy (1939). The temporal pole and rhinal cortex, which are connected to the amygdala, have been shown to modulate emotional behaviours in animal studies (Aggleton and Young, 2000). Hence, patients who have undergone surgery of these structures are likely to exhibit an emotional deficit. Since such surgery is relatively common and has been shown to result in deficits in fear recognition in facial expressions (Anderson et al., 2000; Adolphs et al., 2001; Bruton et al., 2003), aversive taste (Small et al., 2001), conditioned fear (LaBar et al., 1995; Funayama et al., 2001) and enhanced attention for aversive words (Anderson and Phelps, 2001), the evaluation of such patients was our choice for testing the presence of an analogous deficit in the emotional processing of music.

These patients were presented with 56 musical excerpts composed with the intention of inducing fear, peacefulness, happiness and sadness, as if presented in film soundtracks. The patients' task was to perform three different emotional judgements. The first judgement was an emotion classification task which required each participant to judge to what extent the music expressed each labelled emotion (threat, peacefulness, happiness and sadness) on a 10-point scale where 0 corresponded to ‘absent’ and 9 to ‘present’. This procedure was similar to that used in prior studies of emotion recognition in facial expressions (e.g. Adolphs et al., 1994) and in prosody (e.g. Adolphs and Tranel, 1999), in order to enable comparisons. The second and third judgements were more general and required participants to judge each music excerpt for its arousal (from ‘relaxing’ to ‘stimulating’) and its valence (from ‘unpleasant’ to ‘pleasant’) on distinct 10-point scales. The latter judgements were included so as to be able to relate our findings to the fact that damage to the amygdala may impair arousal judgements while sparing valence judgements in the recognition of facial expressions of fear (Adolphs et al., 1999a). Finally, in order to be able to distinguish between an emotional deficit and a perceptual disturbance in processing the musical stimuli, the patients' ability to process the same musical excerpts was assessed by way of an error detection task.

In line with the literature on facial expression recognition, patients with unilateral temporal lobe excisions encroaching on the amygdala were expected to exhibit a deficit in recognizing danger (as compared to any other emotion) in a musical context.

Method

Participants

Sixteen patients with unilateral anteromedial temporal lobe resection for the relief of medically intractable epilepsy and 16 normal controls participated in this study. The patients were operated on at La Salpêtrière Hospital (Paris). All had a medial temporal lobe resection, including the whole amygdala as well as various amounts of the hippocampus and surrounding cortices (entorhinal, perirhinal and parahippocampal cortex) in the left temporal lobe (LTR, n = 8) or in the right temporal lobe (RTR, n = 8). The excision also included the temporal pole in all of our patients, except for two RTR patients. However, the removal never encroached into the superior temporal gyrus. High-resolution MRI volumetric measurements of medial temporal lobe structures were carried out using a protocol initially proposed by Hasboun et al. (1996) and Insausti et al. (1998). The patients underwent imaging with a 1.5-T MRI scanner using a standard head coil and tilted coronal 3D magnetization-prepared rapid acquisition gradient-echo sequence with the following parameters: 14.3/6.3/1 (repetition time/echo time/excitation). This resulted in 124 contiguous T1-weighted partitions with a 1.5 mm section thickness oriented perpendicular to the long axis of the hippocampus. Table 1 shows the remaining volumes of the temporopolar cortex, the perirhinal cortex, the entorhinal cortex, the parahippocampal cortex and the hippocampus in 15 patients (LTR = 8; RTR = 7); t-test comparisons confirmed that the remaining volumes of each medial temporal lobe structure from the resected side did not differ between the two patient groups (all P > 0.05). An illustration of a representative excision is displayed in Fig. 1.

Fig. 1

MRI (coronal view) of a representative patient with a right anteromedial temporal resection at the level of the amygdala.

Table 1

Remaining volumes expressed in cm³ for each patient with left (LTR) and right (RTR) temporal lobe resection

Patient Temporopolar Perirhinal Entorhinal Parahippocampal Hippocampus
LTR01 10.7 145.0 
LTR02 16.2 1.4 19.2 115.5 
LTR03 67.0 48.2 54.7 192.2 
LTR04 47.7 19.8 18.3 143.3 
LTR05 29.4 31.7 166.5 
LTR06 31.7 165.8 
LTR07 33.2 36.3 10.9 188.8 
LTR08 71.7 37.8 22.0 238.8 
Mean 33.1 18.0 24.9 169.5 
RTR01 6.8 169.9 
RTR02 132.5 138.2 19.8 30.9 205.3 
RTR03 5.6 65.9 211.7 
RTR04 279.7 191.7 32.7 34.4 185.6 
RTR05 20.6 169.2 
RTR06 57.8 18.2 38.1 148.9 
RTR07 62.2 
Mean 58.9 60.1 10.1 24.2 164.7 

Unilateral resection of the amygdala was complete in each patient.

In the majority of cases, the cause of the seizures was hippocampal sclerosis dating from birth or early life. Patients with electroencephalographic abnormality, evidence of fast-growing tumours, diffuse cerebral damage, hearing loss or impairment on standard audiometric assessment were excluded from the study. Similarly, subjects with atypical speech representation (as determined by intracarotid sodium Amytal testing; Wada and Rasmussen, 1960) and full-scale IQ (Wechsler Adult Intelligence Scale—Revised) score under 75 were excluded. Patients were tested between 1.2 and 8.3 years postoperatively and they were all seizure-free at the time of testing. The normal controls (NC) had no neurological or psychiatric history and were selected to match the patients as closely as possible in terms of age, sex, education and musical background (Table 2). They were all right-handed, and gave written informed consent before testing, in accordance with the Declaration of Helsinki.

Table 2

Demographic and neuropsychological information for patients with left (LTR) and right (RTR) temporal lobe resections and normal controls (NC)


 
LTR RTR NC
Sex    
    Male 
    Female 
Age (years) 40.5 (29–47) 40.3 (30–60) 38.6 (27–47) 
Education (years) 12.3 (9–16) 13.4 (9–17) 13.4 (9–19) 
IQ 94 (79–114) 98 (79–112) – 
Musical education (years) 0.6 (0–5) 

 

Mean (range) are presented for age, education, intelligence quotient (IQ) and years of musical education.

Emotional task

Fifty-six novel musical excerpts were composed with the intention of inducing or expressing fear, peacefulness, happiness or sadness (14 excerpts per intention) as if the music were part of a film. The musical excerpts followed the rules of the Western tonal system. The stimuli had a regular temporal structure with the exception of a few scary excerpts, as described below. All musical excerpts involved a melody with an accompaniment. The happy excerpts were written in a major mode at an average tempo (metronome marking) of 137 (range 92–196), the melodic line lying in the medium–high pitch range, and the pedal was not used. In contrast, the sad excerpts were written in a minor mode at an average slow tempo (metronome marking 46, range 40–60), with the pedal. The peaceful music was composed in a major mode, had an intermediate tempo (mean metronome marking 74, range 54–100), and was played with pedal and arpeggio accompaniment. The scary music was relatively fast (tempo varied from 96 to 172) and composed with minor chords on the third and sixth degrees, hence implying the use of accidentals. Although most scary excerpts were regular and consonant, a few had irregular rhythms and contained dissonant events (Appendix 1). Since this category of stimuli is more variable in structure and is central to the present study, the musical scores are fully provided in Appendix 1 along with the musical assessment provided by an expert who rated the degree of dissonance, the presence of unexpected events, and the presence of temporal irregularity on three separate five-point scales (1 = consonant, expected, regular; 5 = very dissonant, highly unexpected and irregular). Examples of stimuli for each emotion category can be heard on our web site at www.fas.umontreal.ca/psy/iperetz.html. The stimuli lasted on average 12.4 s (range 9.2–16.4) and were matched in length across the four emotion categories.
Short excerpts of the soundtracks of the films Jaws and Schindler's List served as examples in orienting participants to the tasks.

All stimuli were computer-generated on a microcomputer as Musical Instrument Digital Interface (MIDI) files in a piano timbre, each tone having a precise value in terms of pitch and duration, while maintaining a constant intensity and velocity. The MIDI files were digitally recorded onto compact discs and delivered to the participants over two free-field loudspeakers.

Procedure

Participants were presented with the two examples followed by the 56 stimuli presented in one of two different random orders. For each stimulus, they were asked to judge to what extent it expressed each of four emotions (happiness, sadness, threat, peacefulness) by indicating their rating on a 10-point scale (where 0 corresponded to ‘absent’ and 9 to ‘present’). Participants were informed that a musical excerpt could express more than one emotion. For example, participants were asked to rate a peaceful stimulus with respect to happiness (‘gai’), sadness (‘triste’) and threat (‘épeurant’), and not just peacefulness (‘apaisant’). They were further required to judge each stimulus on two distinct dimensions: arousal and valence. For the arousal dimension, participants rated whether the music sounded relaxing or stimulating on a 10-point scale (0 corresponding to ‘relaxant’ and 9 to ‘stimulant’). For valence, subjects rated on a 10-point scale whether the music sounded pleasant or not (0 corresponding to ‘désagréable’ and 9 to ‘agréable’). They gave judgements in this fixed order. In the rare event that a subject requested to hear a stimulus a second time (never for the patients, 11 times for the 16 controls), it was repeated. No feedback was given, with the exception of the two examples. Subjects were tested individually in a 45 min session.

Error detection task

The error detection task was devised with 24 of the 56 stimuli used in the emotional task (six happy, six sad, six scary and six peaceful). These 24 excerpts were modified so as to contain a timing error. This was done by randomly changing the timing of the tone onsets of the leading voice in an entire measure (bar), thereby giving the impression that the pianist was suddenly losing track of what he or she was playing for a short moment. These 24 modified versions were randomly mixed with 24 intact excerpts. The task was to indicate whether the pianist lost track of what he or she was playing at some point in the piece. Participants responded ‘yes’ if they detected an error and ‘no’ otherwise. There were four practice examples. Participants were not informed of the nature of the changes and no feedback was provided, with the exception of the practice examples. The error detection task was presented following the emotional task.

Results

Error detection task

The percentages of hits (‘yes’ responses to the presence of an error) and false alarms (‘yes’ responses to an intact stimulus) were computed from the responses of each patient and each NC in the error detection task. The percentage of hits minus false alarms corresponded to 70% (SE 3), 81% (SE 5) and 74% (SE 5) for the LTR, RTR and NC, respectively. Although the LTR patients seemed to perform below the RTR patients and the NC, the difference was not significant. The analysis of variance (ANOVA) computed on the percentages of hits minus false alarms as a function of Group (NC, LTR and RTR) yielded F(2,29) = 0.72 (not significant). In fact, every patient performed within 2 SD of the mean obtained by the matched controls.
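The hits-minus-false-alarms score can be illustrated with a short sketch. This is not the authors' analysis code: the function name `hits_minus_false_alarms` and the example responses are hypothetical, and the sketch only assumes the 24 modified and 24 intact excerpts described in the Method.

```python
def hits_minus_false_alarms(responses, has_error):
    """Return the percentage of hits minus the percentage of false alarms.

    responses: list of bool, True if the participant answered 'yes'.
    has_error: list of bool, True if the excerpt contained a timing error.
    """
    if len(responses) != len(has_error):
        raise ValueError("responses and has_error must have the same length")
    n_error = sum(has_error)
    n_intact = len(has_error) - n_error
    if n_error == 0 or n_intact == 0:
        raise ValueError("need at least one modified and one intact trial")
    # A hit is a 'yes' on a modified excerpt; a false alarm is a 'yes' on an intact one.
    hits = sum(r and e for r, e in zip(responses, has_error))
    false_alarms = sum(r and not e for r, e in zip(responses, has_error))
    return 100.0 * hits / n_error - 100.0 * false_alarms / n_intact


# Hypothetical participant: 24 modified excerpts followed by 24 intact ones,
# with 'yes' on 20 of the modified excerpts and on 2 of the intact ones.
has_error = [True] * 24 + [False] * 24
responses = [True] * 20 + [False] * 4 + [True] * 2 + [False] * 22
score = hits_minus_false_alarms(responses, has_error)
```

Subtracting false alarms corrects for a bias towards answering ‘yes’: a participant who says ‘yes’ on every trial obtains a score of 0 rather than 100.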

Emotional tasks

Since participants were free to select as many of the four emotion labels as they wished and to provide a graded judgement for each, we first derived the best label attributed to each musical excerpt by each participant. This was done by selecting the label that had received the maximal rating. When the maximal rating corresponded to the label that matched the intended emotion of the composer, a score of 1 was given. When the maximal rating did not correspond to the intended emotion, a score of 0 was given. When the highest rating was given for more than one label, the response was considered as ‘ambivalent’ and received a score of 0. For example, when a participant judged a musical excerpt to express both peacefulness and sadness to the same degree (e.g. with a rating of 7), it was considered ambivalent. As can be seen in Table 3, normal participants attributed the highest ratings to the intended emotion for the musical stimuli. Sadness tended to be somewhat confused with peacefulness, whereas threat and happiness were clearly distinguished and identified. Patients' judgements differed somewhat, especially for the scary and peaceful stimuli (Table 3).
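The best-label scoring rule described in this paragraph can be sketched as follows. The function names, the label strings and the example ratings are hypothetical choices made for illustration only; the rule itself (maximal rating, ties counted as ambivalent and scored 0) follows the description above.

```python
LABELS = ("threat", "peacefulness", "happiness", "sadness")


def best_label(ratings):
    """Return the label with the maximal rating, or 'ambivalent' on a tie.

    ratings: dict mapping each of the four labels to a rating from 0 to 9.
    """
    top = max(ratings[label] for label in LABELS)
    winners = [label for label in LABELS if ratings[label] == top]
    return winners[0] if len(winners) == 1 else "ambivalent"


def score_response(ratings, intended):
    """Score 1 if the single best label matches the intended emotion, else 0."""
    return 1 if best_label(ratings) == intended else 0


# Hypothetical ratings for one scary excerpt and one peaceful excerpt.
clear = {"threat": 8, "peacefulness": 0, "happiness": 1, "sadness": 3}
tied = {"threat": 0, "peacefulness": 7, "happiness": 2, "sadness": 7}

clear_score = score_response(clear, "threat")       # single maximum on the intended label
tied_score = score_response(tied, "peacefulness")   # tie between two labels
```

In the second example the intended label shares the maximal rating with sadness, so the response counts as ambivalent and receives a score of 0, exactly as in the peacefulness/sadness case given in the text.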

Table 3

Mean percentages of the label that received the maximal ratings by (A) the normal controls, (B) the LTR and (C) the RTR patients, as a function of the four intended emotions

Intention  Response: Threat  Peaceful  Happy  Sad  Ambivalent
(A) Normal controls
Scary 86 (2.8)
Peaceful 70 (5.1) 16
Happy 91 (3.7)
Sad 16 53 (7.1) 20
(B) LTR
Scary 38 (10.7)* 17 38*
Peaceful 36 (9.4)* 21 13 30
Happy 85 (11.2) 14
Sad 13 45 (9.5) 35
(C) RTR
Scary 46 (12.1)* 13 12 23*
Peaceful 46 (10.3)* 14 17 21
Happy 79 (8) 15
Sad 26 33 (9.2) 32

Bold type indicates the match between responses and intentions (SE in parentheses). Ambivalent responses correspond to highest ratings given to more than one label. An asterisk indicates that the patients' judgements differ significantly from normal controls.

The correct derivations of the intended emotion were submitted to an ANOVA considering Group (LTR, RTR, NC) and intended Emotions (scary, peaceful, happy, sad) as between-subjects and within-subjects factors, respectively. Ambivalent responses were not considered in this analysis; an analysis of these will be presented separately below. The groups recognized the intended emotions differently, as attested by a significant Group × Emotion interaction, with F(6,87) = 2.46, P < 0.05. Scary and peaceful stimuli were less well recognized by both LTR patients [t(22) = 4.27 and 3.48, respectively, both P < 0.005 by bilateral tests] and RTR patients [t(22) = 3.15 and 2.29, respectively, both P < 0.05] compared with normal controls. The difference between the two groups of patients did not reach significance.

However, not all patients performed 2 SD below controls, especially for the peaceful stimuli, for which four LTR and five RTR patients obtained normal scores. For the scary stimuli, two LTR and three RTR patients performed as well as normals. Size and side of resection, sex, age and education do not seem to account for the sparing. Finding epileptic cases with intact emotion recognition after damage to the medial temporal structures is recurrent in the literature (e.g. Anderson et al., 2000; Adolphs et al., 2001), although the origin of this sparing is presently unknown.

In order to better characterize the deficit displayed by the patients in recognizing the scary and peaceful stimuli, their ambivalent responses were further examined. As can be seen in Table 3, patients with both LTR and RTR gave overall more ambivalent responses than normal controls for the scary and peaceful stimuli, with F(2,29) = 7.64, P < 0.005. Furthermore, as can be seen in Fig. 2, LTR patients confounded the scary stimuli [90% of the ambivalent responses included the threat label] with sadness (88%), while RTR patients selected the correct threat label only half the time (52% of their ambivalent responses). In fact, the RTR patients displayed true ambivalence, by distributing their ratings among all possible choices. Note that peacefulness, which was often selected by RTR for the scary music, was never selected by normals (Table 3). This anomalous pattern of label selection for the scary music was supported by a significant interaction between Group and Emotional label, with F(6,60) = 2.50, P < 0.05. In contrast, the groups did not seem to differ in their ambivalent rating of the peaceful stimuli (Fig. 2, right panel). This was supported statistically; there was no Group effect [F(2,23) = 2.14, not significant] or interaction between Emotional label and Group [F(6,69) = 0.35, not significant] obtained on the ambivalent responses for this class of stimuli. In general, subjects tended to hear sadness in the peaceful music, as indicated by a main effect of emotional labels [F(3,69) = 15.54, P < 0.001]. Participants selected equally often the peaceful and sad labels in their ambivalent ratings [t(50) = 1.71, not significant in a bilateral test], and more so than any other label.

Fig. 2

Mean percentages of emotional labels selected in the ambivalent responses for the scary stimuli (left panel) and the peaceful stimuli (right panel), as a function of group. Ambivalent responses correspond to the cases where two labels have received the highest rating. NC = normal controls; LTR = left temporal resection; RTR = right temporal resection. An asterisk indicates that the patients' responses differ significantly from normal controls.

In order to control for the possibility that individuals may have been using the rating scale differently, we computed Pearson correlations between the rating profile each participant gave to each stimulus on all four emotional labels with the mean rating profile given to that stimulus by the 16 normal controls. This correlation is identical to those used by researchers in previous studies of facial emotion recognition with brain-damaged patients; such a correlation measure gives a lower variance and avoids possible floor and ceiling effects compared with raw ratings (e.g. Adolphs and Tranel, 1999; Adolphs et al., 2001). Correlations near 1 indicate that patients rated the stimuli normally; correlations near 0 indicate that they rated the stimuli very abnormally. To average the correlation values, the correlations were Z-transformed and averaged over all 14 musical clips that expressed a given emotion, and the average was then inverse Z-transformed to obtain the mean correlation for that emotion. As can be seen in Fig. 3, the correlations indicated that, among all emotions, recognition of the scary stimuli was the most impaired after both LTR [t(22) = 4.06, P < 0.005] and RTR [t(22) = 2.88, P < 0.05] compared with NC (bilateral test). This difference was supported by a significant Group × Emotion interaction, with F(6,87) = 3.73, P < 0.005. There was no other significant difference. Thus, this analysis revealed a specific impairment in rating the scary stimuli.
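The correlation measure and its averaging can be sketched in a few lines. This is an illustrative sketch rather than the authors' procedure in detail: the function names and rating profiles are hypothetical, and the clamping of correlations just below ±1 (needed because the Fisher Z-transform is undefined at exactly ±1) is an assumption that the text does not specify.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length rating profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def mean_correlation(rs, limit=0.999999):
    """Average correlations via the Fisher Z-transform and its inverse.

    The clamp to +/-limit is an assumption made here so that atanh stays finite.
    """
    zs = [math.atanh(max(-limit, min(limit, r))) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical profiles on the four labels (threat, peacefulness, happiness, sadness).
participant = [0, 9, 0, 0]
control_mean = [1, 8, 1, 0]
r = pearson(participant, control_mean)

# Hypothetical per-excerpt correlations for one emotion category.
averaged = mean_correlation([0.9, 0.1])
```

Because the Z-transform stretches values near ±1, the averaged value for 0.9 and 0.1 is higher than their arithmetic mean of 0.5, which is the usual reason for averaging correlations in Z space.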

Fig. 3

Correlation of participant's rating with the mean ratings given by normal controls for the four intended emotions. Black squares indicate means given by 16 normal controls (each normal control's ratings were correlated with the mean ratings of the other 15 subjects). Correlation of ratings given by patients with left (graphic) and right (○) temporal resection with the mean ratings given by normal controls. An asterisk indicates where the patients' ratings differ significantly from normal ratings.

Because the scary stimuli were variable in structure, containing varying degrees of dissonance and irregularity, it was deemed worthwhile to examine the possible influence of these different structural features on the subjects' responses. Unfortunately, no specific contribution of these different structural features could be discerned in the subjects' evaluation of the stimuli, as can be seen in Appendix 1.

There were two other emotional judgements, in terms of valence and arousal, that were potentially informative. Since patients might simply be more conservative in their ratings, the individual scores were transformed to Z scores relative to the individual's own mean and SD of their rating distribution across the arousal and valence scales (Fig. 4). These Z scores were submitted to an ANOVA, considering the emotional Dimension (arousal and valence) and intended Emotion (fear, peacefulness, happiness, sadness) as within-subjects factors and Group (LTR, RTR, NC) as the between-subjects factor. This analysis yielded a significant interaction between these three factors [F(6,87) = 3.61, P < 0.005]. Separate ANOVAs performed by Dimension revealed that only arousal was judged differently by the patients; the interaction between Emotion and Group was significant, with F(6,87) = 3.41, P < 0.01. As can be seen in Fig. 4, the RTR patients found the scary music less stimulating and the sad music less relaxing than normal controls [t(22) = 2.21 and 2.27, respectively, P < 0.05 by bilateral tests], and the LTR patients found the peaceful music to be less relaxing than normal controls [t(22) = 2.40, P < 0.05]. However, RTR and LTR ratings were not significantly different. On the other hand, valence was evaluated similarly by patients and normal controls; there was no effect of Group or interaction between Group and Emotion on the valence ratings [F(2,29) = 1.90 and F(6,87) = 1.38, respectively, both P > 0.05], but there was a main effect of Emotion [F(3,87) = 61.64, P < 0.001]. All participants judged the scary music to be unpleasant compared with the other categories (all t values being significant at P < 0.001), and the happy and peaceful music to be more pleasant than the sad music [t(31) = 4.18 and 7.25, respectively, both P < 0.001].
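The within-participant Z transformation can be sketched as below. The function name and the example ratings are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which SD estimator was used.

```python
import statistics


def within_subject_z(ratings):
    """Convert one participant's ratings to Z scores using their own mean and SD.

    ratings: all arousal and valence ratings given by a single participant.
    The sample SD (statistics.stdev) is an assumption made for this sketch.
    """
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)
    if sd == 0:
        raise ValueError("Z scores are undefined when all ratings are identical")
    return [(r - mean) / sd for r in ratings]


# Two hypothetical participants with the same rating pattern but a different
# use of the scale: one conservative, one using the full range.
conservative = within_subject_z([2, 4, 6])
expansive = within_subject_z([1, 5, 9])
```

After the transformation both participants have the same profile, so a group difference in the Z scores cannot be attributed simply to one group using a narrower part of the 10-point scale.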

Fig. 4

Mean ratings and standard errors expressed as Z scores for arousal (A) and valence (B) as a function of the four intended emotions and group. NC = normal controls; LTR = left temporal resection; RTR = right temporal resection. An asterisk indicates that the patients' ratings differ significantly from normal controls' ratings.

Discussion

The results show that recognition of music composed with the intention to be scary can be impaired by unilateral medial temporal lobe excision. The impairment is relatively selective because recognition of happiness was normal, and recognition of peacefulness and sadness in music was less clearly affected by the medial temporal lobe resection. The disorder does not seem to reflect task difficulty because the scary stimuli were generally easy to identify by normals (with 86% correct recognition of the intended emotion). Hence, patients having sustained such a removal, particularly on the right side of the brain, seem to have lost the knowledge of what signals danger as attested by their aberrant choice of peacefulness or happiness as the intended emotion. This atypical behaviour does not seem to arise as a consequence of a poor perceptual system either. All patients managed to obtain a fairly high level of performance in an error detection task that used the same stimuli as in the emotional task; patients' scores in error detection did not differ from those of normal subjects. Thus, the results indicate that unilateral excision of the medial temporal lobe structures results in a relatively selective emotion recognition deficit that reflects damage to neural networks that are tuned to the recognition of danger.

The most likely neural locus underlying this disorder is the amygdala, which was completely removed in all patients, although damage was not limited to this structure. To warrant the conclusion that it is the removal of the amygdala, and not of the surrounding neural structures, that is responsible for the emotion recognition disorder observed here, one needs to study patients with lesions limited to the amygdala.

On the other hand, resection of the medial temporal lobe structures was also found to impair arousal judgements. The RTR patients found the scary and sad music to be generally less extreme in arousal, and LTR patients found the peaceful music less relaxing than did the normal controls. One common explanation for these abnormal emotional judgements is that patients might find the scary stimuli less threatening than normals not because they cannot perceive these as scary but rather because they are not as aroused by the music compared with normals. However, if this were the case, then the patients should have had problems with the happy stimuli because these were even more arousing than the scary stimuli. Yet, patients performed as normals in recognizing happiness in a musical context. It would seem that these two types of emotional judgements, by discrete category and by dimension, respectively, are relatively independent. However, scary music might be related to arousal in a slightly different manner. Scary music might be more salient (for normal individuals) because of its biological significance. Indeed, it has been proposed that the primary role of the human amygdala is to enhance the perception of stimuli that have emotional salience in order to achieve awareness (Anderson and Phelps, 2001). Perhaps the scary stimuli are the most motivationally significant stimuli in the present set because they signal highly aversive events (e.g. as in the films Psycho and Jaws). Happy stimuli would be inoffensive in this respect, and hence be less salient. In failing to note the saliency of the scary stimuli, due to a diminished arousal level, patients with medial temporal lobe lesions may not predict the advent of potential danger on the basis of music.

In contrast, resection of the medial temporal lobe structures spares pleasantness judgements. Patients judged the different types of stimuli to be as pleasant or unpleasant as normal controls did. That arousal (which was impaired) and pleasantness (which was spared) are dissociable in emotional judgements is not a new finding. For example, it has been recently shown that the amygdala responds more strongly to the emotional arousal elicited by high-arousal odours but does not differentiate between the pleasantness or unpleasantness of the odours (Anderson et al., 2003). This arousal-dependent response fits nicely with the present results found with music and with similar arousal-dependent amygdala responses reported previously with visual stimuli (Canli et al., 2000; Hamann, 2001) and with gustatory stimuli (Small et al., 2003). Pleasantness judgements seem to rely on different brain structures (e.g. the orbitofrontal cortex; Blood et al., 1999; Blood and Zatorre, 2001; Anderson et al., 2003; Small et al., 2003). These regions were intact in our patients. Hence, the present results are consistent with functional segregation between the amygdala and the orbitofrontal cortex in terms of the primary dimensions of arousal and valence.

The fact that pleasantness judgements were spared by medial temporal lobe resection may in turn explain why recognition of most musical emotions tested here was preserved. Indeed, sad music is generally perceived as pleasant or ambiguous on this dimension, because listeners enjoy listening to sad music while aware that the concept of sadness has negative connotations. Hence, peacefulness, happiness and, to an important extent, sadness are all perceived as agreeable in music and are expected to involve more frontal structures, which were spared by the resection in our patient population.

One caveat is that only one category of negative emotion was considered in the present study. Yet music can be aversive in a number of ways, for instance by expressing violence or anger, among other feelings. The reason we did not consider a ‘violence’ category in the present set of stimuli is that violent music can induce fear in the listener, hence making the emotion labelling task ambiguous. This is probably the reason why musical effectiveness in denoting anger is generally low (Gabrielsson and Juslin, 2003). The challenge for future research is to be able to generate musical stimuli that are considered both unpleasant and non-aversive, so as to be able to assess the contribution of the medial temporal structures to negative emotions in general, not just to danger, as expressed by music.

Nevertheless, the major contribution of the present study is related to the power of music to suggest danger. In the past, most studies which have explored the neural correlates of fear have studied the visual modality. Moreover, as recently pointed out by Adolphs and Tranel (2003), within the visual modality, faces appear to be the prime emotional stimuli to trigger the involvement of the amygdala. Converging visual signals, such as body postures, are not as effective. The fact that scary music was also effective in triggering the amygdala and its surrounding structures raises a number of interesting questions. First, it is important to point out that the intention to induce fear was determined both by the internal structure of the music (e.g. irregularity, dissonance) and to some extent by the manner in which it was played (e.g. tempo). In theory, in music one can separate the structural determinants (e.g. mode, dissonance) from its acoustic realization (e.g. tempo). This is similar to the speech situation, whereby one can pronounce a sentence with a particular tone of voice (prosody) while its meaning expresses a different emotion (semantics). Hence, it is possible that the present set of musical stimuli, which combined both forms of emotion expression, is more effective in signalling danger than prosody alone, as manipulated in prior studies that failed to obtain evidence for amygdala involvement with similar patient populations (e.g. Adolphs et al., 2001). Secondly, extending prior findings obtained with facial expressions of emotions to the study of musical expression of emotions raises the issue of the origins of this effect. There is no straightforward way to account for this effect in evolutionary terms. Music does not appear to be designed to induce fear in humans. Unless music was used by our ancestors to frighten away evil spirits or the enemy, it is more likely that the recognition of musical expression of danger is learned by association. This may commence at an early age and be rooted in the infant's response to the mother's voice expressing disapproval. Indeed, scary music may mimic prosodic cues used to code danger (Juslin and Laukka, 2003). Be it learned or innately determined, the perception of danger links music with biologically important functions via its common recruitment of the amygdala and its surrounding structures.

Note

Terms such as ‘scary’, ‘threat’ and ‘danger’ are all used to refer to the emotion expressed by the music that was written with the intention to induce fear. These terms are all broad descriptors that are not meant to be precise, because we believe that our stimuli do not refer to what is commonly felt as fearful.

Appendix 1

Each scary stimulus is represented in musical notation, with its associated rating in terms of dissonance, of unexpectedness and of irregularity. Order of presentation follows these ratings, with the highest values given to the last stimuli. The mean percentage of best label derivation and the mean arousal rating given by the 16 normal controls and the 16 patients are also provided for each stimulus.

[Graphic: musical notation of the scary stimuli with their associated ratings; not reproduced here.]

We thank Bernard Bouchard for composing the musical stimuli and the patients for their cooperation. The work was supported by a grant from the Natural Sciences and Engineering Research Council of Canada to I.P. and by a postgraduate scholarship from the Canadian Fonds de la Recherche en Santé du Québec to N.G.

References

Adolphs R, Tranel D. Intact recognition of emotional prosody following amygdala damage. Neuropsychologia 1999; 37: 1285–92.
Adolphs R, Tranel D. Amygdala damage impairs emotion recognition from scenes only when they contain facial expressions. Neuropsychologia 2003; 41: 1281–9.
Adolphs R, Tranel D, Damasio H, Damasio AR. Impaired recognition of emotion in facial expressions following bilateral damage to the human amygdala. Nature 1994; 372: 669–72.
Adolphs R, Tranel D, Damasio H, Damasio AR. Fear and the human amygdala. J Neurosci 1995; 15: 5879–92.
Adolphs R, Russell JA, Tranel D. A role for the human amygdala in recognizing emotional arousal from unpleasant stimuli. Psychol Sci 1999; 10: 167–71.
Adolphs R, Tranel D, Hamann S, Young AW, Calder AJ, Phelps EA, et al. Recognition of facial emotion in nine individuals with bilateral amygdala damage. Neuropsychologia 1999; 37: 1111–7.
Adolphs R, Tranel D, Damasio H. Emotion recognition from faces and prosody following temporal lobectomy. Neuropsychology 2001; 15: 396–404.
Aggleton JP, Young AW. The enigma of the amygdala: on its contribution to human emotion. In: Lane RD, Nadel L, editors. Cognitive neuroscience of emotion: series in affective science. Oxford: Oxford University Press; 2000. p. 106–28.
Anderson AK, Phelps EA. Intact recognition of vocal expressions of fear following bilateral lesions of the human amygdala. Neuroreport 1998; 9: 3607–13.
Anderson AK, Phelps EA. Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature 2001; 411: 305–9.
Anderson AK, Phelps EA, Spencer DD, Fulbright RK. Contribution of the anteromedial temporal lobes to the evaluation of facial emotion. Neuropsychology 2000; 14: 526–36.
Anderson AK, Christoff K, Stappen I, Panitz D, Ghahremani DG, Glover G, et al. Dissociated neural representations of intensity and valence in human olfaction. Nat Neurosci 2003; 6: 196–202.
Bechara A, Tranel D, Damasio H, Adolphs R, Rockland C, Damasio AR. Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science 1995; 269: 1115–8.
Blood AJ, Zatorre RJ. Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proc Natl Acad Sci USA 2001; 98: 11818–23.
Blood AJ, Zatorre RJ, Bermudez P, Evans AC. Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nat Neurosci 1999; 2: 382–7.
Breiter HC, Etcoff NL, Whalen PJ, Kennedy WA, Rauch SL, Buckner RL, et al. Response and habituation of the human amygdala during visual processing of facial expression. Neuron 1996; 17: 875–87.
Broks P, Young AW, Maratos EJ, Coffey PJ, Calder AJ, Isaac C, et al. Face processing impairments after encephalitis, amygdala damage and recognition of fear. Neuropsychologia 1998; 36: 59–70.
Bruton LA, Wiatt G, Rabin L, Frohlich J, Bernstein V, Susan DD, et al. Perception and priming of affective faces in temporal lobectomy patients. J Clin Exp Neuropsychol 2003; 25: 348–60.
Calder AJ, Young AW, Rowland D, Perrett DI, Hodges JR, Etcoff NL. Facial emotion recognition after bilateral amygdala damage; differentially severe impairment of fear. Neuropsychology 1996; 13: 699–745.
Canli T, Zhao Z, Brewer J, Gabrieli JD, Cahill LJ. Event-related activation in the human amygdala associates with later memory for individual emotional experience. J Neurosci 2000; 20: 1–5.
Cohen A. Music as source of emotion in film. In: Juslin PN, Sloboda JA, editors. Music and emotion: Theory and research. New York: Oxford University Press; 2001. p. 249–72.
Davis M. The role of the amygdala in fear and anxiety. Annu Rev Neurosci 1992; 15: 353–75.
Funayama ES, Grillon C, Davis M, Phelps EA. A double dissociation in the affective modulation of startle in humans: effects of unilateral temporal lobectomy. J Cogn Neurosci 2001; 14: 721–9.
Gabrielsson A, Juslin PN. Emotional expression in music. In: Davidson RJ, Goldsmith HH, Scherer KR, editors. Handbook of affective sciences. New York: Oxford University Press; 2003. p. 503–34.
Halgren E, Walter R, Cherlow D, Crandall P. Mental phenomena evoked by electrical stimulation of the human hippocampal formation and amygdala. Brain 1978; 101: 83–117.
Hamann S. Cognitive and neural mechanisms of emotional memory. Trends Cogn Neurosci 2001; 5: 394–400.
Hasboun D, Chantôme M, Zauaoiu A, Sahel M, Deladoeuille M, Sourour N, et al. MR determination of hippocampal volume: comparison of three methods. Am J Neuroradiol 1996; 17: 905–9.
Insausti R, Juottonen K, Insausti AM, Partanen K, Vainio P, Laasko MP, et al. Volumetric analysis of the human, entorhinal, perirhinal, and temporopolar cortices. Am J Neuroradiol 1998; 19: 659–71.
Juslin PN, Laukka P. Communication of emotions in vocal expression and music performance: different channels, same code? Psychol Bull 2003; 129: 770–814.
Kluver H, Bucy PC. Preliminary analysis of functions of the temporal lobes in monkeys. Arch Neurol Psychiatry 1939; 42: 979–97.
Labar D, Le Doux J, Spencer DD, Phelps EA. Impaired fear conditioning following unilateral temporal lobectomy in humans. J Neurosci 1995; 15: 6846–55.
LaBar KS, Gatenby JC, Gore JC, LeDoux JE, Phelps EA. Human amygdala activation during conditioned fear acquisition and extinction: a mixed-trial fMRI study. Neuron 1998; 20: 937–45.
LeDoux JE. The emotional brain: the mysterious underpinnings of emotional life. New York: Touchstone; 1996.
Morris JS, Frith CD, Perrett DI, Young AW, Calder AJ, Dolan RJ. A differential neural response in the human amygdala to fearful and happy facial expressions. Nature 1996; 383: 812–5.
Morris JS, Friston CD, Büchel C, Frith CD, Young AW, Calder AJ, et al. A neuromodulatory role for the human amygdala in processing emotional facial expressions. Brain 1998; 121: 47–57.
Morris JS, Scott SK, Dolan RJ. Saying it with feeling: neural responses to emotional vocalizations. Neuropsychologia 1999; 37: 1155–63.
Phillips ML, Young AW, Senior C, Brammer M, Calder AJ, Bullmore ET, et al. A specific neural substrate for perceiving facial expressions of disgust. Nature 1997; 389: 495–8.
Phillips ML, Young AW, Scott SK, Calder AJ, Andrew C, Giampietro V, et al. Neural response to facial and vocal expressions of fear and disgust. Proc R Soc Lond B Biol Sci 1998; 265: 1809–17.
Royet J-P, Zald D, Versace R, Costes N, Lavenne F, Koenig O, et al. Emotional responses to pleasant and unpleasant olfactory, visual, and auditory stimuli: a positron emission tomography study. J Neurosci 2000; 20: 7752–9.
Scott SK, Young AW, Calder AJ, Hellawell D, Aggleton JP, Johnson M. Impaired auditory recognition of fear and anger following bilateral amygdala lesions. Nature 1997; 385: 254–7.
Small DM, Zatorre RJ, Jones-Gotman M. Increased intensity perception of aversive taste following right anteromedial temporal lobe removal in humans. Brain 2001; 124: 1566–75.
Small DM, Gregory MD, Mak YE, Gitelman D, Mesulam MM, Parrish T. Dissociation of neural representation of intensity and affective valuation in human gustation. Neuron 2003; 39: 701–11.
Wada J, Rasmussen T. Intracarotid injection of sodium amytal for the lateralization of cerebral speech dominance: experimental and clinical observations. J Neurosurg 1960; 17: 266–82.
Whalen PJ, Rauch SL, Etcoff NL, McInerney SC, Lee MB, Jenike MA. Masked presentations of emotional facial expression modulate amygdala activity without explicit knowledge. J Neurosci 1998; 18: 411–8.
Young AW, Aggleton JP, Hellawell DJ, Johnson M, Broks P, Hanley JR. Face processing impairments after amygdalotomy. Brain 1995; 118: 15–24.
Young AW, Hellawell D, Van de Wal C, Johnson M. Facial expression after amygdalotomy. Neuropsychologia 1996; 34: 31–9.

Author notes

1Department of Psychology, University of Montreal, 2Music Department, Concordia University, Montreal, 3Departments of Epilepsy and 4Neuroradiology, La Salpêtrière Hospital, 5CNRS–UPR640, 6INSERM–U739, Paris and 7Department of Psychology (URECA), Université de Lille 3, Villeneuve d'Ascq, France