Abstract

We aimed at testing the cortical representation of complex natural sounds within auditory cortex by conducting 2 human magnetoencephalography experiments. To this end, we employed an adaptation paradigm and presented subjects with pairs of complex stimuli, namely, animal vocalizations and spectrally matched noise. In Experiment 1, we presented stimulus pairs of same or different animal vocalizations and same or different noise. Our results suggest a 2-step process of adaptation effects: first, we observed a general item-unspecific reduction of the N1m peak amplitude at 100 ms, followed by an item-specific amplitude reduction of the P2m component at 200 ms after stimulus onset for both animal vocalizations and noise. Multiple dipole source modeling revealed the right lateral Heschl's gyrus and the bilateral superior temporal gyrus as sites of adaptation. In Experiment 2, we tested for cross-adaptation between animal vocalizations and spectrally matched noise sounds, by presenting pairs of an animal vocalization and its corresponding or a different noise sound. We observed cross-adaptation effects for the P2m component within bilateral superior temporal gyrus. Thus, our results suggest selectivity of the evoked magnetic field at 200 ms after stimulus onset in nonprimary auditory cortex for the spectral fine structure of complex sounds rather than their temporal dynamics.

Introduction

The characterization of computational processes in the brain is an ultimate goal in the cognitive neurosciences. However, noninvasive functional neuroimaging methods are restricted by their limited spatial resolution. Recently, adaptation paradigms have been widely applied to overcome these limitations and to characterize the stimulus selectivity of cortical areas. These paradigms employ adaptation effects, that is, signal decreases due to stimulus repetition. In the visual domain, functional magnetic resonance imaging (fMRI) adaptation has been employed to characterize processing in both early (Tootell et al. 1998) and higher order visual areas (Grill-Spector et al. 1999, 2006; James et al. 2002; Altmann et al. 2004). Recent magnetoencephalography (MEG) studies characterized the temporal profile of adaptation to visual shapes and suggest a change not only in response amplitude due to adaptation but also in response timing (Noguchi et al. 2004).

In the auditory domain, adaptation paradigms have not been as extensively employed. Human fMRI studies showed evidence for correlations between behavioral effects of repetition priming and repetition-associated reduction of fMRI responses to environmental sounds (Bergerbest et al. 2004). Furthermore, combined human fMRI and electroencephalography (EEG)/MEG studies have provided evidence for stimulus-specific adaptation effects both for pure tones and noise (Jääskeläinen et al. 2004) and vowels (Ahveninen et al. 2006) in nonprimary auditory areas.

However, the temporal profile of adaptation to repeated stimulation with complex nonlanguage-related natural sounds is largely unknown. Furthermore, it is unclear which acoustic and higher order features lead to adaptation in the auditory cortex. Thus, in these experiments, we aimed at investigating the representation of complex natural sounds and spectrally matched noise using an MEG adaptation paradigm. We employed only a single category of natural sounds, namely, animal vocalizations. This class of nonspeech sounds is learned early in development and is associated with perceptually rich and tangible objects in our environment. Neurophysiological studies in the macaque revealed preferential responses of neurons for conspecific monkey vocalizations in nonprimary auditory cortex within the anterior lateral belt (Rauschecker et al. 1995; Rauschecker 1997, 1998; Rauschecker and Tian 2000; Tian et al. 2001). Furthermore, positron emission tomography studies in the macaque showed increased metabolic responses in the bilateral superior temporal gyrus (STG) for general animal vocalizations but left-lateralized responses for conspecific calls (Poremba et al. 2004). In humans, areas within bilateral STG and the superior temporal sulcus (STS) have been suggested to be involved in the processing of human voices and vocalizations (Belin et al. 2000; Fecteau et al. 2004, 2005) and environmental sounds (Giraud and Price 2001; Maeder et al. 2001; Lewis et al. 2004). Recent EEG experiments provided evidence for an early dissociation of processing of animal vocalizations compared with sounds of artificial objects (Murray et al. 2006). In a previous MEG study, changes of animal vocalizations, vowels, and noise resulted in evoked magnetic mismatch fields over bilateral anterior temporal and inferior frontal regions and left-lateralized enhancements of fast oscillatory activity (Kaiser et al. 2002).
Similarly, a series of fMRI studies presented subjects with complex natural sounds (animal voices, tools, dropped objects, and liquids) and revealed stronger fMRI activity in the bilateral middle portion of the STG for animal compared with tool sounds (Lewis et al. 2005).

To investigate adaptation processes to animal vocalizations, we conducted 2 experiments. In Experiment 1, we presented subjects with pairs of stimuli that consisted either of 2 identical or different animal vocalizations or of 2 identical or different noise sounds. If the cortical representation of animal vocalizations is based on their identity or their temporal dynamics, we hypothesized that adaptation effects should occur for the animal vocalizations only, but not for the noise pairs. If, however, adaptation occurs for both categories of stimuli, the observed adaptation effects are possibly based on the spectral fine structure of the stimuli, rather than their temporal properties. Alternatively, adaptation effects could occur for both animal vocalizations and noise stimuli, but at distinct spatial locations or within different time windows, suggesting differential representation of the stimuli. Thus, localization of the underlying neural generators is necessary to characterize adaptation to the different stimulus categories. In Experiment 2, we tested for cross-adaptation between animal vocalizations and spectrally matched noise sounds. To this end, we presented pairs of an animal vocalization and a spectrally matched or a nonmatched noise sound. Cross-adaptation would further indicate representation of spectral sound features rather than spectrotemporal features.

Materials and Methods

Subjects

Twelve healthy, right-handed volunteers (age range 23–53, 8 males, 4 females) participated in Experiment 1. One subject was excluded from analysis because no clear evoked response (N1m and P2m components) to auditory stimulation was obtained. All subjects apart from the excluded one participated in Experiment 2 (age range 23–53, 8 males, 3 females). The subjects had normal hearing abilities and gave their informed consent to participate in the study. The experiments were performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and approved by the ethics committee of the National Institute for Physiological Sciences, Okazaki, Japan.

Stimuli

Eight different animal vocalizations (cat, cow, dog, horse, owl, pig, sheep, tiger) were taken from a database specifically designed for auditory psychophysics (Marcell et al. 2000) and from a commercial sound CD (Sound Ideas, Richmond Hill, Ontario, Canada). Sounds were digitized with a sampling rate of 22 050 Hz. Sound duration was 500 ms and sound intensity level was 82 dB. The sounds were equalized with respect to their root mean square energy and their amplitude envelope. The normalized amplitude envelope ensured similarity across stimuli of the onset and offset parameters, to which the evoked magnetic responses are particularly sensitive (Biermann and Heil 2000). The animal vocalizations were chosen from periods of sustained amplitude to avoid the loss of substantial sound information due to the amplitude normalization. This procedure disrupted the overall amplitude envelope, but modulations within the different frequency bands were still preserved. As depicted in Figure 1, control stimuli were created by filtering white noise with the spectral envelope of the animal vocalizations, thus generating a spectrally matched but temporally distorted and unrecognizable sound. All sounds were presented with air-conducting earphones (E-A-Rtone 3A, Aearo Corporation, Indianapolis, IN). The frequency transformation characteristics of the auditory stimulation system ensured reliable stimulus presentation up to 4.4 kHz. The onsets and offsets of all sounds were smoothed utilizing a Hanning window with a 50-ms rise and decay period.
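The onset/offset smoothing and root mean square equalization described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used: the function names, the raised-cosine form of the ramp, the order of the two steps, the target RMS value, and the 440-Hz test tone are all hypothetical; only the 22 050-Hz sampling rate, the 500-ms duration, and the 50-ms rise and decay period are taken from the text.

```python
import math

FS = 22050       # sampling rate in Hz (from the text)
DUR_S = 0.5      # sound duration in s (from the text)
RAMP_S = 0.05    # rise and decay period in s (from the text)


def apply_hanning_ramps(samples, fs=FS, ramp_s=RAMP_S):
    """Scale onset and offset by the rising/falling half of a Hanning window."""
    n_ramp = int(round(ramp_s * fs))
    out = list(samples)
    n = len(out)
    for i in range(min(n_ramp, n // 2)):
        # Raised-cosine gain rising from 0 towards 1 over the ramp.
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        out[i] *= gain            # onset
        out[n - 1 - i] *= gain    # offset (mirror image)
    return out


def rms(samples):
    """Root mean square of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def equalize_rms(samples, target_rms=0.1):
    """Rescale to a common RMS level (target value is a hypothetical choice)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)
    scale = target_rms / current
    return [s * scale for s in samples]


# Hypothetical test signal standing in for a 500-ms sound.
n_samples = int(round(DUR_S * FS))
tone = [math.sin(2 * math.pi * 440.0 * t / FS) for t in range(n_samples)]
stimulus = equalize_rms(apply_hanning_ramps(tone))
```

Applying the same two functions to every vocalization and noise sound would give all stimuli identical onset and offset ramps and the same overall energy, which is the property the text relies on.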

Figure 1.

Sample stimuli. The top panel shows an animal vocalization (cat) and the bottom panel its spectrally matched control stimulus. The left 2 graphs show the amplitude waveform, the middle graphs show the time–frequency diagram, and the rightmost graphs depict the envelope of the overall power spectrum of the 2 sound stimuli.

Procedure

Before the MEG experiment, subjects were asked whether they could spontaneously name the animal vocalizations. On average, 6.4/8 animal vocalizations were correctly named (cat 8/11, cow 11/11, dog 11/11, horse 11/11, owl 7/11, pig 6/11, sheep 11/11, tiger 4/11 subjects correctly naming the respective animal), while the noise sounds were not recognizable. Moreover, we tested the subjects' ability to distinguish between animal vocalizations and the spectrally matched noise sound in a 2-alternative forced-choice task. To this end, subjects were presented with pairs of an animal vocalization and its matched noise sound and were instructed to indicate the presentation of an animal vocalization. Performance was at a high level in this task (percent correct rate 93%), indicating a clear perceptual difference between the 2 sound categories. The MEG experiments consisted of 6 experimental runs with a duration of 4 min. Each trial consisted of a pair of stimuli: the first stimulus (S1) was presented for 500 ms, followed by an interstimulus interval of 500 ms, then the second stimulus (S2) was presented for 500 ms, followed by a 1500-ms silent period. In Experiment 1, the stimuli were paired according to the following conditions: 1) 2 identical animal vocalizations, 2) 2 different animal vocalizations, 3) 2 identical noise sounds, and 4) 2 different noise sounds. In Experiment 2, we paired the stimuli according to the following conditions: 1) an animal vocalization and the spectrally matched noise sound, 2) an animal vocalization and a different noise sound, and 3) and 4) were similar to 1) and 2) but with reversed stimulus order. Additionally, we introduced a target condition in which a pure tone (1000 Hz) was randomly paired with a noise stimulus or an animal vocalization. The pure tone appeared randomly either as S1 or S2 stimulus and was matched in length and average root mean square to the animal vocalizations and noise stimuli.
Subjects were instructed to press a button whenever they heard the 1000-Hz pure tone. Subjects performed at a high level at the detection task during the experiment (correct hits: 93% in both experiments). Within a run, we presented each condition and the target condition 16 times. Thus, for each condition, we acquired 96 trials.
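The trial timing and trial counts given above can be checked with a few lines of arithmetic. The sketch assumes that the target condition counts as one additional condition with 16 trials per run and that no extra pauses occur between trials; the variable names are illustrative only.

```python
# Trial structure in ms (from the text).
S1_MS = 500
ISI_MS = 500
S2_MS = 500
SILENCE_MS = 1500

N_RUNS = 6
REPS_PER_RUN = 16
N_CONDITIONS = 4   # experimental conditions per experiment
N_TARGET = 1       # pure-tone target condition (assumed to count as one condition)

trial_ms = S1_MS + ISI_MS + S2_MS + SILENCE_MS               # duration of one trial
trials_per_run = (N_CONDITIONS + N_TARGET) * REPS_PER_RUN    # trials in one run
run_min = trials_per_run * trial_ms / 60000                  # run duration in minutes
trials_per_condition = REPS_PER_RUN * N_RUNS                 # trials per condition overall

print(trial_ms, trials_per_run, run_min, trials_per_condition)
```

Under these assumptions the numbers agree with the reported run duration of 4 min and the 96 trials per condition.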

MEG Acquisition and Data Analysis

The magnetic responses to the auditory stimuli were recorded with a helmet-shaped 306-channel detector array (Vectorview; ELEKTA Neuromag, Helsinki, Finland), which comprised 102 identical triple sensor elements. Each sensor element consisted of 2 orthogonal planar gradiometers and one magnetometer coupled to a multi-SQUID (superconducting quantum interference device) and thus provided 3 independent measurements of the magnetic fields. In the present study, we analyzed MEG signals recorded from the 204 planar-type gradiometers. The signals from these sensors are strongest when the sensors are located just above local cerebral sources (Nishitani and Hari 2002). The MEG signals were recorded with 0.1–200 Hz band-pass filters and digitized at 1 kHz. Before MEG recordings, 4 head position indicator (HPI) coils were placed at specific sites on the scalp. To determine the exact head location with respect to the MEG sensors, electric current was fed to the HPI coils, and the resulting magnetic fields were measured with the magnetometer. These procedures allowed for alignment of the individual head coordinate system with the magnetometer coordinate system. The locations of the HPI coils with respect to the 3 anatomical landmarks (nasion and bilateral preauricular points) were also measured using a 3-dimensional digitizer to align the coordinate systems of MEG with magnetic resonance (MR) images obtained with a 3-T magnetic resonance imaging system (Allegra; Siemens, Erlangen, Germany). We adopted a head-based coordinate system used in a previous study (Wasaka et al. 2003). In this coordinate system, the x-axis was fixed with the preauricular points, with the positive direction to the right. The positive y-axis passed through the nasion from posterior to anterior, and the z-axis thus pointed upward.
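A head-based coordinate system of the kind described above can be constructed from the nasion and the 2 preauricular points as in the following sketch. The text does not state where the origin lies; here it is assumed to be the foot of the perpendicular dropped from the nasion onto the line joining the preauricular points. The function names and the landmark coordinates in the example are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(a, k):
    return tuple(x * k for x in a)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    length = math.sqrt(dot(a, a))
    return scale(a, 1.0 / length)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def head_frame(nasion, lpa, rpa):
    """Return (origin, (x_axis, y_axis, z_axis)) of the head coordinate system."""
    x_axis = unit(sub(rpa, lpa))                 # through preauricular points, positive right
    # ASSUMPTION: origin = projection of the nasion onto the preauricular line.
    origin = add(lpa, scale(x_axis, dot(sub(nasion, lpa), x_axis)))
    y_axis = unit(sub(nasion, origin))           # posterior to anterior, through the nasion
    z_axis = cross(x_axis, y_axis)               # right-handed system, so z points upward
    return origin, (x_axis, y_axis, z_axis)


def to_head(point, origin, axes):
    """Express a point given in device coordinates in head coordinates."""
    rel = sub(point, origin)
    return tuple(dot(rel, axis) for axis in axes)


# Hypothetical digitized landmarks in device coordinates (mm).
origin, axes = head_frame(nasion=(5.0, 110.0, -20.0),
                          lpa=(-65.0, 10.0, -20.0),
                          rpa=(75.0, 10.0, -20.0))
head_point = to_head((57.0, 29.0, 40.0), origin, axes)
```

Because the y-axis is built perpendicular to the x-axis by construction, the cross product yields a unit z-axis without further normalization.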

The signals in the 4 conditions were averaged separately, time locked to the onset of the S1 stimuli. The averaging epoch ranged from 100 ms before to 2000 ms after the S1 onset, and the prestimulus period (initial 100 ms) was used as the baseline. Epochs in which signal variation was larger than 3000 fT were excluded from averaging. The averaged responses were low-pass filtered with a cut-off frequency of 40 Hz, employing a zero-phase shift Butterworth filter. To avoid motor contamination, only responses to nontarget pairs were used in the analysis. For each subject, vector sums were calculated from the longitudinal and latitudinal derivatives of the responses recorded on the planar gradiometers at each of the 102 recording sites. Vector sums were computed by squaring the MEG signals of each gradiometer pair, summing the 2 squared signals, and then calculating the square root of this sum (Bonte et al. 2006). For an initial overview of the evoked magnetic fields, we computed areal averages across 24 gradiometer pairs in left and right temporal lobe, respectively, similar to previous studies (Tarkiainen et al. 2003). The peak amplitude and latency of the N1m component were determined for each subject by evaluating a 60-ms window centered on 100 ms after stimulus onset. Accordingly, the P2m peak latency and amplitude were determined within a 100-ms window centered on 200 ms after stimulus onset.
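The vector sum of a gradiometer pair (the square root of the summed squared signals) and the window-based peak search can be sketched as follows. This is an illustration only: the function names and the synthetic Gaussian test signals are hypothetical, and baseline correction, epoch rejection, and filtering are omitted.

```python
import math


def vector_sum(grad_a, grad_b):
    """Root of the summed squares of the 2 orthogonal planar gradiometer signals."""
    return [math.sqrt(a * a + b * b) for a, b in zip(grad_a, grad_b)]


def peak_in_window(signal, times_ms, center_ms, width_ms):
    """Return (latency_ms, amplitude) of the maximum in a window centered on center_ms."""
    lo = center_ms - width_ms / 2.0
    hi = center_ms + width_ms / 2.0
    candidates = [(amp, t) for amp, t in zip(signal, times_ms) if lo <= t <= hi]
    amp, t = max(candidates)
    return t, amp


def bump(t, center, width=15.0):
    """Gaussian-shaped deflection used only to build synthetic test data."""
    return math.exp(-((t - center) ** 2) / (2.0 * width ** 2))


# Synthetic averaged responses sampled at 1 kHz, -100 to 499 ms around stimulus onset.
times_ms = list(range(-100, 500))
grad_a = [3.0 * bump(t, 105) + 1.5 * bump(t, 211) for t in times_ms]
grad_b = [4.0 * bump(t, 105) + 2.0 * bump(t, 211) for t in times_ms]

vs = vector_sum(grad_a, grad_b)
n1m_latency, n1m_amp = peak_in_window(vs, times_ms, center_ms=100, width_ms=60)
p2m_latency, p2m_amp = peak_in_window(vs, times_ms, center_ms=200, width_ms=100)
```

Note that the vector sum is nonnegative by construction, so both components appear as positive deflections and a maximum search is sufficient in each window.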

Source locations and the time courses of source activities for each individual subject were determined using multiple source analysis and brain electric source analysis (NeuroScan, Mclean, VA), as described previously (Inui et al. 2004, 2006). The multiple dipole models were obtained for the evoked magnetic field in response to the S1 stimulation, in particular within the time range from 0 to 300 ms after S1 onset. Single dipoles were added successively to model the evoked magnetic fields. The model adequacy was assessed by examining 1) the percent variance (Hari et al. 1988), 2) the F-ratio (ratio of reduced chi-square values before and after adding a new source) (Supek and Aine 1993), and 3) residual waveforms (i.e., the difference between the recorded data and the model). Channels that exhibited an excessive noise level were excluded from analysis (average: 4.3/204 channels per subject). Goodness-of-fit values for the N1m and P2m components in response to the S1 stimulation were above 80% for all subjects and conditions. The anatomical sites of the sources were determined by coregistration with the individual subjects' anatomical MR image.
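The percent variance criterion can be illustrated with one common definition of goodness of fit, namely the share of the measured signal power that the dipole model accounts for. The text does not give the exact formula used by the analysis software, so the definition below, the function name, and the example values are assumptions.

```python
def goodness_of_fit(measured, modeled):
    """Percent of measured signal power explained by the model (assumed definition)."""
    residual = sum((m - p) ** 2 for m, p in zip(measured, modeled))
    total = sum(m ** 2 for m in measured)
    return 100.0 * (1.0 - residual / total)


# Hypothetical field values across 4 channels at one latency (arbitrary units).
measured = [10.0, -8.0, 6.0, -4.0]
modeled = [9.0, -7.0, 6.0, -5.0]
gof = goodness_of_fit(measured, modeled)
```

The residual term in the numerator corresponds to the residual waveforms named in the text, that is, the difference between the recorded data and the model.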

To assess the differences across conditions for the peak amplitudes and peak latencies of the vector sums over the left and right temporal lobes, we employed a repeated measurement analysis of variance (ANOVA) with factors component (N1m/P2m), hemisphere (left/right), stimulus category (animal/noise), and repetition (same/different) for Experiment 1. In Experiment 2, we conducted a repeated measurement ANOVA with factors component (N1m/P2m), hemisphere (left/right), stimulus order (animal vocalization as first stimulus/animal vocalization as second stimulus), and repetition (same/different).
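For a within-subject factor with only 2 levels, such as repetition (same/different), the main-effect F value of a fully within-subject repeated measurement ANOVA equals the squared paired t statistic computed on each subject's mean across the remaining factors, with 1 and n − 1 degrees of freedom. The sketch below illustrates this relation; the function name and the amplitude reductions are hypothetical and are not data from the study.

```python
import math
import statistics


def paired_f(same, different):
    """F(1, n-1) for a 2-level within-subject factor, computed as squared paired t."""
    diffs = [s - d for s, d in zip(same, different)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical S1-S2 amplitude reductions (%) for 11 subjects, averaged over other factors.
same_reduction = [21, 22, 23, 22, 21, 22, 23, 22, 21, 22, 23]
different_reduction = [20] * 11

f_value, dof = paired_f(same_reduction, different_reduction)
```

With 11 subjects the degrees of freedom are (1, 10), which matches the F1,10 notation used for the sensor-level results; interaction terms require the full ANOVA and are not covered by this shortcut.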

Results

Experiment 1: Adaptation Effects for Animal Vocalizations and Noise Sounds

As shown in Figure 2, in Experiment 1 we observed 2 prominent components following S1 and S2 for the evoked magnetic responses averaged across left and right temporal areas, respectively. First, an N1m component occurred with a peak latency at about 105 (standard deviation ±11) ms after S1 stimulus onset and second, a P2m component with a peak latency at about 211 (±16) ms after S1 onset. Similar components were observed in response to S2 presentation, that is, an N1m component peaking at about 108 (±16) ms after S2 onset and a P2m component with a peak latency of about 216 (±22) ms after S2 onset. The peak amplitudes were reduced for the S2 compared with the S1 response. In particular, the N1m component exhibited a reduction of about 18%, whereas the P2m component was reduced by about 13%. Employing a 4-way repeated measurement ANOVA, we observed a significant main effect for repetition (F1,10 = 8.85, P < 0.05) and an interaction between component and repetition (F1,10 = 15.64, P < 0.01). The reductions from S1 to S2 showed no differences across conditions for the N1m component (hemisphere: F1,10 = 1.02, P = 0.34; stimulus category: F1,10 < 1, P = 0.95; repetition: F1,10 < 1, P = 0.69). However, the P2m component clearly showed stronger reductions for same compared with different stimuli (repetition: F1,10 = 15.00, P < 0.01) but no effects of hemisphere (F1,10 = 3.37, P = 0.10) or stimulus category (F1,10 < 1, P = 0.73). No peak latency differences were observed for either the N1m or the P2m component in response to S1 or S2 (P > 0.05 for all repeated measurement ANOVAs with factors stimulus category and condition). Thus, our data suggest that stimulus repetition results in a general, item-unspecific reduction of evoked magnetic fields for the N1m component. More specifically, the N1m component of the S2 evoked magnetic response was reduced both when S2 was the same stimulus as S1 and when it was a different one.
In contrast, amplitude reductions for the P2m component in both left and right temporal lobe areas were item specific, that is, reductions occurred only when the same item was repeated.

Figure 2.

Averaged vector sums of the evoked magnetic fields in Experiment 1. (a) Group-averaged (n = 11) time courses of the evoked magnetic fields for left and right temporal sensors. (b) Average S1–S2 reduction of the peak amplitudes for the evoked magnetic field. Error bars depict mean standard errors.

Multiple dipole modeling revealed sources in the bilateral lateral Heschl's gyrus (HG) in all subjects. Within these HG sources, a clear N1m component was observed for all subjects and a P2m component was observed for 8/11 in left HG and 9/11 subjects in right HG. Furthermore, we observed dipole sources in the left STG for 7/11, in right STG for 9/11, in left planum temporale for 2/11 and right planum temporale for 3/11, in the left posterior parietal cortex for 3/11, and in the right anterior insula for 2/11 subjects. On average, we fitted 5.3 (minimum: 4, maximum: 7) dipole sources to the magnetic field evoked by S1 presentation for each single subject. While Figure 3 shows the evoked magnetic field topography, Figure 4 depicts the corresponding dipole models for 2 representative subjects.

Figure 3.

Topography of the evoked magnetic field for 2 representative subjects. (a and b) The topography of the evoked magnetic field for the N1m and P2m component superimposed onto a standard head surface.

Figure 4.

Dipole models for 2 representative subjects. (a and c) The location of the estimated dipoles (condition: animal vocalization different) superimposed on the individual subject's cortical surface reconstruction. Sources within lateral HG are depicted in red, STG sources are depicted in light blue, and all other estimated dipoles are depicted in green. Dashed circles show the location of occluded source centers. (b and d) The source waveforms for the estimated dipoles (HG and STG).

The left HG source exhibited an N1m component at about 113 (±14) ms and the right HG source at about 112 (±12) ms after S2 onset; these were followed by a P2m component at 211 (±33) ms in left HG and at 223 (±27) ms in right HG. For the left STG source, we observed a P2m component at about 215 (±28) ms after S2 onset and for the right STG at 209 (±22) ms. As shown in Figure 5a, we did not observe significant differences across conditions for the N1m component in HG (left HG, stimulus category: F1,10 = 1.22, P = 0.29; repetition: F1,10 < 1, P = 0.90; interaction: F1,10 < 1, P = 0.91; right HG, stimulus category: F1,10 < 1, P = 0.98; repetition: F1,10 = 3.46, P = 0.09; interaction: F1,10 < 1, P = 0.95). Interestingly, the P2m component within the right HG exhibited significantly larger amplitude reductions when the same stimuli were repeated (F1,7 = 9.13, P < 0.05), and there was a tendency for such a repetition effect within the left HG (F1,8 = 4.94, P = 0.06). For both left and right STG, we observed a significant item-specific adaptation effect for the P2m component (left STG: F1,6 = 6.58, P < 0.05; right STG: F1,8 = 5.15, P < 0.05), but no main effect for the stimulus category (left STG: F1,6 < 1, P = 0.43; right STG: F1,8 < 1, P = 0.86), and no interaction between the factors repetition and stimulus category (left STG: F1,6 < 1, P = 0.75; right STG: F1,8 < 1, P = 0.78). There were no significant effects for the peak latencies across conditions (P > 0.05 for all repeated measurement ANOVAs). Thus, item-specific adaptation effects were observed for the P2m component amplitude in the bilateral STG and right lateral HG.

Figure 5.

(a) Average S1–S2 reduction of peak amplitudes for the source dipole moments in Experiment 1. For the left and right HG sources, both the N1m and P2m amplitude reductions are shown (left 4 bar graphs). For left and right STG source, the P2m amplitude reductions are shown (right 2 bar graphs). Error bars depict mean standard errors across subjects. (b) Source dipole locations for all subjects averaged across conditions for the HG and STG dipoles in a head coordinate system. The upper 2 graphs show the position in the xy coordinate plane and the lower 2 graphs in the yz coordinate plane.

Figure 5.

(a) Average S1–S2 reduction of peak amplitudes for the source dipole moments in Experiment 1. For the left and right HG sources, both the N1m and P2m amplitude reductions are shown (left 4 bar graphs). For left and right STG source, the P2m amplitude reductions are shown (right 2 bar graphs). Error bars depict mean standard errors across subjects. (b) Source dipole locations for all subjects averaged across conditions for the HG and STG dipoles in a head coordinate system. The upper 2 graphs show the position in the xy coordinate plane and the lower 2 graphs in the yz coordinate plane.

As depicted in Figure 5b, the STG sources were localized mainly lateral to the HG sources. The position of the STG sources varied between subjects along the anterior–posterior axis. As shown in Table 1, the spatial location of the dipole sources did not show significant differences across conditions (P > 0.05 for all 2-way repeated measurement ANOVAs with factors stimulus category and repetition).

Table 1

Head-related coordinates in mm ± standard deviation averaged across subjects for the HG and STG sources in Experiment 1

Source       Condition            x (mm)      y (mm)      z (mm)
HG, left     Animal, same         −49 ± 7     13 ± 7      63 ± 9
             Animal, different    −51 ± 8     13 ± 7      63 ± 7
             Noise, same          −48 ± 6     13 ± 6      62 ± 8
             Noise, different     −49 ± 10    14 ± 7      62 ± 6
HG, right    Animal, same         52 ± 6      20 ± 5      60 ± 6
             Animal, different    51 ± 6      18 ± 7      60 ± 6
             Noise, same          52 ± 5      19 ± 6      60 ± 6
             Noise, different     54 ± 4      19 ± 7      62 ± 4
STG, left    Animal, same         −56 ± 10    13 ± 17     59 ± 7
             Animal, different    −55 ± 10    15 ± 17     62 ± 9
             Noise, same          −55 ± 9     14 ± 17     60 ± 8
             Noise, different     −55 ± 10    12 ± 18     61 ± 9
STG, right   Animal, same         56 ± 7      11 ± 15     56 ± 10
             Animal, different    55 ± 7      12 ± 14     56 ± 10
             Noise, same          56 ± 6      13 ± 15     54 ± 8
             Noise, different     57 ± 7      13 ± 14     57 ± 9

Experiment 2: Cross-Adaptation between Animal Vocalizations and Noise Sounds

In Experiment 2, we tested for cross-adaptation between animal vocalizations and noise sounds. A 4-way repeated measurements ANOVA on the N1m and P2m components revealed a significant main effect for repetition (F1,10 = 5.08, P < 0.05) and an interaction between component and repetition (F1,10 = 5.15, P < 0.05). As depicted in Figure 6a,b, the reductions from S1 to S2 showed no differences across conditions for the N1m component (hemisphere: F1,10 < 1, P = 0.71; stimulus order: F1,10 < 1, P = 0.76; repetition: F1,10 = 1.94, P = 0.19). However, the P2m component showed cross-adaptation effects between animal vocalizations and their matched noise sounds (repetition: F1,10 = 6.61, P < 0.05) but no effects of hemisphere (F1,10 < 1, P = 0.55) or stimulus order (F1,10 < 1, P = 0.89).

Figure 6.

Averaged vector sums of the evoked magnetic fields in Experiment 2. (a) Group-averaged (n = 11) time courses of the evoked magnetic fields for left and right temporal sensors. (b) Average S1–S2 reduction of the peak amplitudes for the evoked magnetic field. (c) Average S1–S2 reduction of peak amplitudes for the source dipole moments in Experiment 2. For the left and right HG sources, both the N1m and P2m amplitude reductions are shown (left 4 bar graphs). For left and right STG source, the P2m amplitude reductions are shown (right 2 bar graphs). Error bars depict mean standard errors across subjects.

Similar to Experiment 1, we obtained multiple dipole models for Experiment 2. While clear N1m components were observed in the bilateral HG for all subjects, a P2m component was found in 11/11 subjects in left HG, 10/11 in right HG, 6/11 in left STG, and 9/11 in right STG. As shown in Figure 6c, repetition effects were observed for the P2m component in the bilateral STG only (left STG: F1,5 = 7.45, P < 0.05; right STG: F1,8 = 8.09, P < 0.05) but not for the N1m component in HG (left HG: F1,10 = 3.68, P = 0.08; right HG: F1,10 < 1, P = 0.71) or for the P2m component in HG (left HG: F1,10 = 1.83, P = 0.21; right HG: F1,9 = 3.12, P = 0.11). Additionally, we found an effect of stimulus order for the P2m component within HG (left HG: F1,10 = 9.46, P < 0.05; right HG: F1,9 = 6.64, P < 0.05), that is, more reduction when an animal vocalization was preceded by a noise sound compared with when an animal vocalization preceded a noise sound. Thus, cross-adaptation effects between animal vocalizations and noise sounds were observed for the P2m component amplitude in the bilateral STG, similar to the adaptation effects seen in Experiment 1.

Discussion

Employing an MEG adaptation paradigm, we observed 2 forms of repetition-induced response attenuation in the human auditory cortex. First, our data suggest item-unspecific amplitude reductions for the N1m component at about 100 ms after stimulus onset. More specifically, the N1m component showed reductions not only when 2 identical sounds but also when 2 different sounds were presented in succession. This component was followed by item-specific amplitude reductions of the P2m component at about 200 ms after stimulus onset, that is, reductions occurred only when the same stimulus was repeated. Furthermore, in a second experiment, we observed cross-adaptation between animal vocalizations and spectrally matched noise sounds for the P2m component. Because animal vocalizations and noise sounds shared the overall spectral content, but not temporospectral properties or meaning, we suggest an involvement of the P2m component in spectral processes.

In studies investigating auditory evoked potentials (AEPs), the P2 component is usually observed 150–250 ms after the onset of an auditory stimulus (Crowley and Colrain 2004) and has been shown to exhibit enhancements after training in a pitch discrimination task (Bosnyak et al. 2004) and in vowel discrimination (Reinke et al. 2003). Increases of the P2 amplitude have been observed after speech training (Tremblay and Kraus 2002), and the P2 amplitude has been suggested to be a marker for musical experience (Shahin et al. 2003). Similarly, the P2m component, the magnetic counterpart of the P2 response as determined in MEG studies, has been shown to be enhanced in long-term trained musicians during listening to musical instrument tones (Kuriki et al. 2006). However, previous AEP studies reported modulation of the P2 amplitude not only by discrimination training but also by mere repetition of speech sounds (Sheehan et al. 2005). Furthermore, the P2 amplitude is possibly modulated not only by long-term or short-term plasticity; it has also shown sensitivity to acoustic stimulus features such as the spectral complexity of musical sounds (Shahin et al. 2005).

In the present study, source modeling has suggested that the generators of the P2m are located along the bilateral STG and within the lateral HG. This corroborates previous findings that suggested secondary auditory cortex lateral to HG as generator of the P2 component investigated in AEP studies (Scherg et al. 1989; Picton et al. 1999; Bosnyak et al. 2004; Shahin et al. 2005). MEG studies on complex sound, vowel, and music processing have shown that the sources of the P2m component are anterior to the N1m sources (Tiitinen et al. 1999; Kuriki et al. 2006; Hoshiyama et al. 2007). However, our findings suggest interindividual variability of P2m source localization. More specifically, we observed P2m sources both anterior and posterior to the N1m source. A previous study combined MEG recordings and intracerebral recordings and obtained similar results, that is, source localizations both anterior and posterior to HG (Godey et al. 2001). The authors proposed that multiple sources in the superior temporal cortex might underlie P2m generation.

A recent fMRI study suggested involvement of the left STG in the selective representation of animal vocalizations (Altmann et al. 2007). In contrast to the present study, the natural amplitude envelope of the stimuli was preserved in that fMRI experiment. Adaptation effects were observed for the animal vocalizations only, and no cross-adaptation between animal vocalizations and spectrally matched control stimuli was observed. This suggested selective representation of the spectrotemporal dynamics of the stimuli rather than simple spectral features. In contrast, our present results showed adaptation effects for the P2m component for both the animal vocalizations and spectrally matched noise, as well as cross-adaptation between the 2 sound categories. Thus, the P2m component appears to be more related to the representation of the spectral fine structure of auditory stimuli than to their temporal features. The lack of adaptation effects that are specific to the animal vocalizations might be accounted for by the similarity of the temporal amplitude envelope between animal vocalizations and noise sounds in the present study. The discrepancy between fMRI adaptation effects and the P2m reductions in the STG possibly indicates the colocalization of different processes in higher order auditory areas. Accordingly, areas in the superior temporal lobe have been associated with processing of both temporal and spectral aspects of auditory stimuli. For example, recent fMRI studies have revealed overlapping areas within the lateral HG and the planum temporale that are sensitive to both amplitude- and frequency-modulated tones (Giraud et al. 2000; Hart et al. 2003). Moreover, an fMRI study that investigated brain responses to changes in the spectral envelope of noise and harmonic sounds showed selectivity for the spectral envelope in the right STS (Warren et al. 2005).
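
The logic of contrasting within-category adaptation with cross-adaptation can be illustrated with a simple normalized index. The following Python sketch is only an illustration under stated assumptions: the function `adaptation_index` and the amplitude values are hypothetical and are neither data nor the analysis reported in this study. It expresses the amplitude reduction for a repeated (or spectrally corresponding) second stimulus relative to a different second stimulus.

```python
def adaptation_index(amp_repeated, amp_changed):
    """Normalized amplitude reduction for repeated vs. changed second stimuli.

    Positive values indicate a smaller response when the second sound
    repeats (or spectrally corresponds to) the first, i.e., adaptation.
    """
    total = amp_changed + amp_repeated
    if total == 0:
        raise ValueError("amplitudes sum to zero; index is undefined")
    return (amp_changed - amp_repeated) / total


# Hypothetical P2m source strengths in arbitrary units (NOT data from this study).
same_vocalization = 8.0        # vocalization paired with the same vocalization
different_vocalization = 12.0  # vocalization paired with a different vocalization
corresponding_noise = 9.0      # vocalization paired with its spectrally matched noise
different_noise = 12.0         # vocalization paired with a different noise

within_category = adaptation_index(same_vocalization, different_vocalization)
cross_category = adaptation_index(corresponding_noise, different_noise)

print(f"within-category index: {within_category:.3f}")
print(f"cross-category index:  {cross_category:.3f}")
```

Under these assumed values, both indices are positive. A positive cross-category index is the pattern one would expect if the response depended on spectral content shared by a vocalization and its matched noise, whereas an index near zero would be expected if it depended on spectrotemporal dynamics.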

The spatial estimates of the N1m component in this study are in line with intracerebral recordings (Godey et al. 2001; Yvert et al. 2005) that suggested auditory cortex in intermediate and lateral HG and the planum temporale as generators. Similarly, combined MEG/fMRI studies employing an fMRI-weighted source estimation approach provided evidence for 2 N1m generators in the anterolateral HG and posterior STG/planum temporale and proposed an important role for the N1m component as a marker for neuronal adaptation (Jääskeläinen et al. 2004; Ahveninen et al. 2006). More specifically, the N1m response has been shown to exhibit stimulus-specific amplitude reductions when pure tones or vowels were repeatedly presented. In contrast, in the present study, adaptation effects were restricted to the P2m response. This finding is in line with recent MEG evidence for P2m-amplitude reductions due to repetition of a complex tone (Hoshiyama et al. 2007). In this study, the N1m was not sensitive to the spectral fine structure of the stimulus. Possibly, the N1m response is determined by both adaptation and lateral inhibition effects. In particular, lateral inhibition occurs between neurons with neighboring frequency tuning as suggested by MEG studies (Pantev et al. 1999, 2004; Okamoto et al. 2004). For the P2m response, such lateral inhibition effects have not been observed (Okamoto et al. 2005), but further research is needed to clarify inhibitory processes related to the P2m component. Thus, due to the broadband spectral properties of the employed stimuli, the N1m might have been affected by both adaptation and inhibition effects, whereas the P2m showed a clear adaptation effect, possibly less affected by lateral inhibition.
Alternatively, because the N1m component has been proposed to be particularly sensitive to sound onset parameters (Biermann and Heil 2000), it is also possible that the lack of item-specific adaptation effects for the N1m component can be accounted for by the similarity of sounds regarding their onset parameters. In the present study, all noise stimuli and animal vocalizations had a similar amplitude envelope. Employing sounds with preserved natural amplitude modulation might lead to item-specific adaptation effects for the N1 component.

Thus, although the N1m components generated within the lateral HG did not exhibit item-specific adaptation effects, sources within right lateral HG and along the bilateral STG adapted in an item-specific manner at about 200 ms after stimulus onset. Further research is required to systematically test the sensitivity of the N1m and P2m components to different types of auditory stimuli with varying degrees of acoustic and semantic complexity.

Conclusions

In sum, our study provides evidence for item-specific adaptation effects for the P2m component at about 200 ms after stimulus onset. Multiple dipole source analysis suggested nonprimary auditory cortex within right lateral HG and along the bilateral STG as the underlying cortical substrate. Item-specific adaptation effects were not limited to animal vocalizations but were also observed for spectrally matched noise. Furthermore, we observed cross-adaptation between animal vocalizations and spectrally matched but spectrotemporally different noise sounds for the P2m component within bilateral STG. These findings suggest that adaptation effects for the P2m component are based on the spectral structure of the auditory stimulus rather than the temporal dynamics or the meaning of the sound. Thus, we propose that the auditory P2m component is involved in the processing of spectrally complex sounds and plays a role in both short- and long-term plasticity of auditory cortex.

Funding

Japan Society for the Promotion of Science fellowship to C.F.A.

The authors are most grateful to Yasuyuki Takeshima, Osamu Nagata, and Masahiro Hirai for technical assistance. Conflict of Interest: None declared.

References

Ahveninen J, Jääskeläinen IP, Raij T, Bonmassar G, Devore S, Hämäläinen M, Levänen S, Lin FH, Sams M, Shinn-Cunningham BG, et al. Task-modulated “what” and “where” pathway in human auditory cortex. Proc Natl Acad Sci USA, 2006, vol. 103 (pg. 14608-14613)
Altmann CF, Deubelius A, Kourtzi Z. Shape saliency modulates contextual processing in the human lateral occipital complex. J Cogn Neurosci, 2004, vol. 16 (pg. 794-804)
Altmann CF, Doehrmann O, Kaiser J. Selectivity for animal vocalizations in the human auditory cortex. Cereb Cortex, 2007. doi: 10.1093/cercor/bhl167
Belin P, Zatorre RJ, Lafaille P, Ahad P, Pike B. Voice-selective areas in the human auditory cortex. Nature, 2000, vol. 403 (pg. 309-312)
Bergerbest D, Ghahremani DG, Gabrieli JD. Neural correlates of auditory repetition priming: reduced fMRI activation in the auditory cortex. J Cogn Neurosci, 2004, vol. 16 (pg. 966-977)
Biermann S, Heil P. Parallels between timing of onset responses of single neurons in cat and of evoked magnetic fields in human auditory cortex. J Neurophysiol, 2000, vol. 84 (pg. 2426-2439)
Bonte M, Parviainen T, Hytönen K, Salmelin R. Time course of top-down and bottom-up influences of syllable processing in the auditory cortex. Cereb Cortex, 2006, vol. 16 (pg. 115-123)
Bosnyak DJ, Eaton RA, Roberts LE. Distributed auditory cortical representations are modified when non-musicians are trained at pitch discrimination with 40 Hz amplitude modulated tones. Cereb Cortex, 2004, vol. 14 (pg. 1088-1099)
Crowley KE, Colrain IM. A review of the evidence for P2 being an independent component process: age, sleep and modality. Clin Neurophysiol, 2004, vol. 115 (pg. 732-744)
Fecteau S, Armony JL, Joanette Y, Belin P. Is voice processing species-specific in human auditory cortex? An fMRI study. Neuroimage, 2004, vol. 23 (pg. 840-848)
Fecteau S, Armony JL, Joanette Y, Belin P. Sensitivity to voice in human prefrontal cortex. J Neurophysiol, 2005, vol. 94 (pg. 2251-2254)
Giraud AL, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak R, Kleinschmidt A. Representation of the temporal envelope of sounds in the human brain. J Neurophysiol, 2000, vol. 84 (pg. 1588-1598)
Giraud AL, Price CJ. The constraints functional neuroimaging places on classical models of auditory word processing. J Cogn Neurosci, 2001, vol. 13 (pg. 754-765)
Godey B, Schwartz D, de Graaf JB, Chauvel P, Liegeois-Chauvel C. Neuromagnetic source localization of auditory evoked fields and intracerebral evoked potentials: a comparison of data in the same patients. Clin Neurophysiol, 2001, vol. 112 (pg. 1850-1859)
Grill-Spector K, Henson R, Martin A. Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn Sci, 2006, vol. 10 (pg. 14-23)
Grill-Spector K, Kushnir T, Edelman S, Avidan G, Itzchak Y, Malach R. Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron, 1999, vol. 24 (pg. 187-203)
Hari R, Joutsiniemi S-L, Sarvas J. Spatial resolution of neuromagnetic records: theoretical calculations in a spherical model. Electroencephalogr Clin Neurophysiol, 1988, vol. 71 (pg. 64-72)
Hart HC, Palmer AR, Hall DA. Amplitude and frequency-modulated stimuli activate common regions of human auditory cortex. Cereb Cortex, 2003, vol. 13 (pg. 773-781)
Hoshiyama M, Okamoto H, Kakigi R. Priority of adaptation to mismatch response following undiscriminable auditory stimulation: a magnetoencephalographic study. Eur J Neurosci, 2007, vol. 25 (pg. 854-862)
Inui K, Okamoto H, Miki K, Gunji A, Kakigi R. Serial and parallel processing in the human auditory cortex: a magnetoencephalographic study. Cereb Cortex, 2006, vol. 16 (pg. 18-30)
Inui K, Wang X, Tamura Y, Kaneoke Y, Kakigi R. Serial processing in the human somatosensory system. Cereb Cortex, 2004, vol. 14 (pg. 851-857)
Jääskeläinen IP, Ahveninen J, Bonmassar G, Dale AM, Ilmoniemi RJ, Levänen S, Lin FH, May P, Melcher J, Stufflebeam S, et al. Human posterior auditory cortex gates novel sounds to consciousness. Proc Natl Acad Sci USA, 2004, vol. 101 (pg. 6809-6814)
James TW, Humphrey GK, Gati JS, Menon R, Goodale MA. Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron, 2002, vol. 35 (pg. 793-801)
Kaiser J, Lutzenberger W, Ackermann H, Birbaumer N. Dynamics of gamma-band activity induced by auditory pattern changes in humans. Cereb Cortex, 2002, vol. 12 (pg. 212-221)
Kuriki S, Kanda S, Hirata Y. Effects of musical experience on different components of MEG responses elicited by sequential piano-tones and chords. J Neurosci, 2006, vol. 26 (pg. 4046-4053)
Lewis JW, Brefczynski JA, Phinney RE, Janik JJ, DeYoe EA. Distinct cortical pathways for processing tool versus animal sounds. J Neurosci, 2005, vol. 25 (pg. 5148-5158)
Lewis JW, Wightman FL, Brefczynski JA, Phinney RE, Binder JR, DeYoe EA. Human brain regions involved in recognizing environmental sounds. Cereb Cortex, 2004, vol. 14 (pg. 1008-1021)
Maeder PP, Meuli RA, Adriani M, Bellmann A, Fornari E, Thiran JP, Pittet A, Clarke S. Distinct pathways involved in sound recognition and localization: a human fMRI study. Neuroimage, 2001, vol. 14 (pg. 802-816)
Marcell MM, Borella D, Greene M, Kerr E, Rogers S. Confrontation naming of environmental sounds. J Clin Exp Neuropsychol, 2000, vol. 22 (pg. 830-864)
Murray MM, Camen C, Gonzalez Andino SL, Bovet P, Clarke S. Rapid brain discrimination of sounds of objects. J Neurosci, 2006, vol. 26 (pg. 1293-1302)
Nishitani N, Hari R. Viewing lip forms: cortical dynamics. Neuron, 2002, vol. 36 (pg. 1211-1220)
Noguchi Y, Inui K, Kakigi R. Temporal dynamics of neural adaptation effect in the human visual ventral stream. J Neurosci, 2004, vol. 14 (pg. 6283-6290)
Okamoto H, Kakigi R, Gunji A, Kubo T, Pantev C. The dependence of the auditory evoked N1m decrement on the bandwidth of preceding notch-filtered noise. Eur J Neurosci, 2005, vol. 21 (pg. 1957-1961)
Okamoto H, Ross B, Kakigi R, Kubo T, Pantev C. N1m recovery from decline after exposure to noise with strong spectral contrasts. Hear Res, 2004, vol. 196 (pg. 77-86)
Pantev C, Okamoto H, Ross B, Stoll W, Ciurlia-Guy E, Kakigi R, Kubo T. Lateral inhibition and habituation of the human auditory cortex. Eur J Neurosci, 2004, vol. 19 (pg. 2337-2344)
Pantev C, Wollbrink A, Roberts LE, Engelien A, Lutkenhöner B. Short-term plasticity of the human auditory cortex. Brain Res, 1999, vol. 842 (pg. 192-199)
Picton TW, Alain C, Woods DL, John MS, Scherg M, Valdes-Sosa P, Bosch-Bayard J, Trujillo NJ. Intracerebral sources of human auditory evoked potentials. Audiol Neurootol, 1999, vol. 4 (pg. 64-79)
Poremba A, Malloy M, Saunders RC, Carson RE, Herscovitch P, Mishkin M. Species-specific calls evoke asymmetric activity in the monkey's temporal poles. Nature, 2004, vol. 427 (pg. 448-451)
Rauschecker JP. Processing of complex sounds in the auditory cortex of cat, monkey, and man. Acta Otolaryngol, 1997, Suppl. 532 (pg. 34-38)
Rauschecker JP. Parallel processing in the auditory cortex of primates. Audiol Neurootol, 1998, vol. 3 (pg. 86-103)
Rauschecker JP, Tian B. Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proc Natl Acad Sci USA, 2000, vol. 97 (pg. 11800-11806)
Rauschecker JP, Tian B, Hauser M. Processing of complex sounds in the macaque nonprimary auditory cortex. Science, 1995, vol. 268 (pg. 111-114)
Reinke KS, He Y, Wang C, Alain C. Perceptual learning modulates sensory evoked response during vowel segregation. Brain Res Cogn Brain Res, 2003, vol. 17 (pg. 781-791)
Scherg M, Vajsar J, Picton TW. A source analysis of the late human auditory evoked potentials. J Cogn Neurosci, 1989, vol. 1 (pg. 336-355)
Shahin A, Bosnyak DJ, Trainor LJ, Roberts LE. Enhancement of neuroplastic P2 and N1c auditory evoked potentials in skilled musicians. J Neurosci, 2003, vol. 23 (pg. 5545-5552)
Shahin A, Roberts LE, Pantev C, Trainor LJ, Ross B. Modulation of P2 auditory-evoked responses by the spectral complexity of musical sounds. Neuroreport, 2005, vol. 16 (pg. 1781-1785)
Sheehan KA, McArthur GM, Bishop DVM. Is discrimination training necessary to cause changes in the P2 auditory event-related brain potential to speech sounds? Brain Res Cogn Brain Res, 2005, vol. 25 (pg. 547-553)
Supek S, Aine CJ. Simulation studies of multiple dipole neuromagnetic source localization: model order and limits of source resolution. IEEE Trans Biomed Eng, 1993, vol. 40 (pg. 529-540)
Tarkiainen A, Helenius P, Salmelin R. Category-specific occipitotemporal activation during face perception in dyslexic individuals: an MEG study. Neuroimage, 2003, vol. 19 (pg. 1194-1204)
Tian B, Reser D, Durham A, Kustov A, Rauschecker JP. Functional specialization in rhesus monkey auditory cortex. Science, 2001, vol. 292 (pg. 290-293)
Tiitinen H, Sivonen P, Alku P, Virtanen J, Näätänen R. Electromagnetic recordings reveal latency differences in speech and tone processing in humans. Brain Res Cogn Brain Res, 1999, vol. 8 (pg. 355-363)
Tootell RB, Hadjikhani NK, Vanduffel W, Liu AK, Mendola JD, Sereno MI, Dale AM. Functional analysis of primary visual cortex (V1) in humans. Proc Natl Acad Sci USA, 1998, vol. 95 (pg. 811-817)
Trembley KN, Kraus N. Auditory training induces asymmetrical changes in cortical neural activity. J Speech Lang Hear Res, 2002, vol. 45 (pg. 564-572)
Warren JD, Jennings AR, Griffiths TD. Analysis of the spectral envelope of sounds by the human brain. Neuroimage, 2005, vol. 24 (pg. 1052-1057)
Wasaka T, Hoshiyama M, Nakata H, Nishihira Y, Kakigi R. Gating of somatosensory evoked magnetic fields during the preparatory period of self-initiated finger movement. Neuroimage, 2003, vol. 20 (pg. 1830-1838)
Yvert B, Fischer C, Bertrand O, Pernier J. Localization of human supratemporal auditory areas from intracerebral auditory evoked potentials using distributed source models. Neuroimage, 2005, vol. 28 (pg. 140-153)