The investigation of functional hemispheric asymmetries in auditory processing in the human brain remains a challenge. Classical lesion studies and recent neuroimaging studies indicate that speech is dominantly processed in the left hemisphere, whereas music is dominantly processed in the right. However, recent studies have demonstrated that these functional hemispheric asymmetries are not limited to the processing of highly cognitive sound signals such as speech and music but rather originate from the basic neural processing of elementary sound features, that is, spectral and temporal acoustic features. Here, in contrast to previous studies, we used carefully composed tones and pulse trains as stimuli, balanced the overall physical sound input between spectral and temporal change conditions, and traced the time course of neural activity evoked by spectral versus temporal sound input change by means of magnetoencephalography (MEG). These findings support the hypothesis that spectral change is dominantly processed in the right hemisphere, whereas temporal change is dominantly processed in the left.
Functional hemispheric asymmetries of the human auditory cortices regarding speech processing have been actively investigated since the late 19th century (Broca 1861; Wernicke 1874). Originally, the left hemispheric dominance for speech processing was discovered through behavioral observation of patients with neurological deficits. More recently, neuroimaging techniques have enabled the examination of neural activity in the awake human brain. Studies using functional magnetic resonance imaging (fMRI) (Belin et al. 2000) and magnetoencephalography (MEG) (Eulitz et al. 1995; Alho et al. 1998; Szymanski et al. 2001) confirmed that speech is indeed dominantly processed in the left hemisphere. In contrast, music is thought to be dominantly processed in the right hemisphere (Zatorre et al. 1994; Griffiths et al. 1999; Zatorre et al. 2002). However, hemispheric asymmetries might not be confined to the processing of meaningful sounds like speech and music but may also manifest during the processing of simple acoustic stimuli that do not convey particular meaning (e.g., pure tones, band-pass noises, pulse trains).
Basically, the spectral and the temporal features of sounds play decisive roles for auditory encoding and perception (Moore 2003). Behavioral studies have indicated that spectral processing is particularly important for music perception (Vos and Troost 1989; Warrier and Zatorre 2002), whereas temporal cues play an essential role in speech perception (Drullman et al. 1994a, 1994b; Shannon et al. 1995). This disproportionate reliance on spectral and temporal auditory processing capacities of the human brain might cause asymmetric neural activities between hemispheres during speech and music processing.
In principle, however, it is impossible to analyze (or encode) sounds with perfect precision in both the time and frequency domains simultaneously (“acoustic uncertainty principle”; cf. Joos 1948; Zatorre et al. 2002). For instance, a long time window chosen for sound signal analysis yields high frequency resolution but comparably poor temporal resolution, whereas a short time window results in high temporal but relatively poor frequency resolution. Due to this axiomatic uncertainty, there must be a trade-off between the precision of spectral and temporal auditory processing within each hemisphere, and this trade-off may differ between the hemispheres, leading to a relative hemispheric specialization for temporal or spectral processing, respectively.
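This trade-off can be illustrated numerically. The following sketch (illustrative only; the sampling rate and window lengths are arbitrary choices, not values from this study) computes the RMS time and frequency widths of Gaussian analysis windows and shows that narrowing the time window broadens the spectrum, with the time–bandwidth product bounded below by the Gabor limit of 1/(4π):

```python
import numpy as np

fs = 8000.0                        # sampling rate (Hz), illustrative
t = np.arange(-1.0, 1.0, 1 / fs)   # 2-s time axis centered at 0

def tf_widths(sigma_t):
    """RMS time and frequency widths of a Gaussian analysis window."""
    x = np.exp(-t**2 / (2 * sigma_t**2))
    power_t = x**2 / np.sum(x**2)                       # temporal power density
    spec = np.abs(np.fft.fftshift(np.fft.fft(x)))**2    # power spectrum
    f = np.fft.fftshift(np.fft.fftfreq(len(t), 1 / fs))
    power_f = spec / np.sum(spec)                       # spectral power density
    dt = np.sqrt(np.sum(power_t * t**2))                # RMS temporal width
    df = np.sqrt(np.sum(power_f * f**2))                # RMS spectral width
    return dt, df

dt_long, df_long = tf_widths(0.100)    # "long" (100-ms) analysis window
dt_short, df_short = tf_widths(0.010)  # "short" (10-ms) analysis window
```

A Gaussian window attains the Gabor limit exactly, so both products come out at approximately 1/(4π) ≈ 0.0796: better temporal resolution is necessarily bought at the cost of spectral resolution.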
A recent positron-emission tomography (PET) study (Zatorre and Belin 2001) and a subsequent fMRI study (Jamison et al. 2006) examined the spectral–temporal trade-off hypothesis by measuring changes in brain activity in participants who listened to successive series of pure tone sequences. In a “temporal” condition, subjects were presented with 2 pure tones (0.5 and 1 kHz), which alternated at different rates. In a “spectral” condition, the alternation rate was held constant, whereas the number of presented tones was varied. The results demonstrated that the core auditory cortex was particularly sensitive to temporal variation, whereas the anterior superior temporal cortex was particularly sensitive to spectral variation. Crucially, temporal information was dominantly processed in the left hemisphere, whereas spectral information was dominantly processed in the right hemisphere. Our recent MEG studies (Okamoto, Stracke, Ross, et al. 2007; Okamoto, Stracke, Wolters, et al. 2007; Stracke et al. 2008) also demonstrated left hemispheric dominance during the processing of tone onsets presented simultaneously with background noises. This finding possibly reflects the postulated temporal resolution dominance of the left over the right hemisphere (Tallal et al. 1993; Belin et al. 1998; Poeppel 2003; Boemio et al. 2005).
Based on the aforementioned findings, the question arises whether the hemispheric asymmetries observed for speech and music processing at least partly reflect the asymmetric processing of low-level acoustic features. Therefore, the goal of the present study was to investigate the hemispheric lateralization of auditory magnetic fields evoked by spectral versus temporal sound input change. Previous studies (Zatorre and Belin 2001; Boemio et al. 2005; Schonwiesner et al. 2005; Jamison et al. 2006) employed block designs in order to estimate brain activity while presenting auditory stimuli characterized by different degrees of spectral and temporal variation. However, in these studies the auditory stimuli were not overall physically identical between blocks, and therefore they did not isolate brain activity elicited by either spectral or temporal change alone. Moreover, PET (Kim et al. 2000) and especially fMRI (Amaro et al. 2002) generate acoustic noise during acquisition, which may differentially influence test sound–related neural activity in the left and right auditory cortices (cf. Okamoto, Stracke, Ross, et al. 2007; Okamoto, Stracke, Wolters, et al. 2007; Stracke et al. 2008). In the present study, we overcame these drawbacks by directly measuring population-level neural responses elicited by either spectral or temporal sound input change with millisecond-scale temporal resolution in awake humans by means of acoustically noiseless MEG. Crucially, we used sound stimuli that were overall (i.e., in sum) identical between spectral and temporal change conditions, thereby controlling for physical input differences between conditions. The results were expected to reveal 1) how the left and right hemispheres deal with 2 fundamental acoustic dimensions (spectral vs. temporal) and 2) the precise time course of neural activity elicited by either spectral or temporal sound input change in a silent environment.
Materials and Methods
In all, 13 healthy subjects participated in experiment 1 (7 females; mean ± standard deviation [SD]: 25.7 ± 1.7 years) and experiment 2 (7 females; mean ± SD: 25.9 ± 2.6 years). Participants had no history of psychological or neurological disorders and were unambiguously right handed (assessed via “Edinburgh Handedness Inventory”; Oldfield 1971). Their hearing thresholds were within a normal range, as tested by clinical pure tone audiometry. Participants gave written informed consent in accordance with procedures approved by the Ethics Commission of the Medical Faculty, University of Muenster.
Stimuli and Experimental Design
Experiment 1 (Tonal Stimuli)
Test stimuli (TS) had a duration of 1600 ms (12.5-ms rise and fall times) and were composed of 2 parts. Each part of a TS consisted of one of the following stimuli: a 500-Hz pure tone (PT500), a 40-Hz 100% amplitude-modulated (AM) tone with a carrier frequency of 500 Hz (AM500), a 2000-Hz pure tone (PT2000), or a 40-Hz AM tone with a carrier frequency of 2000 Hz (AM2000). The stimuli were combined to represent 2 conditions: spectral change or temporal change. The stimulus waveforms for both conditions are displayed in Figure 1 and Supplementary Figure S1. In the case of spectral change, the first carrier frequency had a 12.5-ms sigmoid offset ramp and the second carrier frequency had a 12.5-ms onset ramp starting at a latency of 787.5 ms. Thus, the carrier frequency of the TS changed, but the modulation pattern (i.e., the temporal envelope) remained constant (PT500 → PT2000, PT2000 → PT500, AM500 → AM2000, and AM2000 → AM500; audio material provided with Supplementary Fig. S1). In the case of temporal change, only the modulation pattern of the TS changed from PT to AM or from AM to PT at a latency of 787.5 ms, whereas the carrier frequency remained constant (PT500 → AM500, PT2000 → AM2000, AM500 → PT500, and AM2000 → PT2000; audio material provided with Supplementary Fig. S1). All sound stimuli were prepared as sound files and presented via Presentation (Neurobehavioral Systems Inc., Albany, CA). Frequency tags of 18 000 Hz were attached to the onsets of the stimuli in order to obtain precise timing of the sound presentation. All sounds were presented through 60-cm plastic tubes and silicon earpieces fitted individually to each subject's ears. The hearing threshold for PT500 was determined before each MEG measurement, and PT500 was presented binaurally at 60 dB sensation level (i.e., 60 dB above the individual hearing threshold). The maximal amplitudes of PT500, PT2000, AM500, and AM2000 were identical. Consequently, the power was identical between PT500 and PT2000 and 4.3 dB smaller for AM500 and AM2000.
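For illustration, the following sketch (not the original stimulus-generation code; the sampling rate and the sinusoidal modulator are our assumptions) synthesizes a pure tone and a 100% AM tone with equal maximal amplitude, and confirms the reported power difference of roughly 4.3 dB:

```python
import numpy as np

fs = 48000.0                 # sampling rate (Hz), assumed
dur = 0.8                    # one stimulus part (s)
t = np.arange(int(fs * dur)) / fs

def pure_tone(fc):
    return np.sin(2 * np.pi * fc * t)

def am_tone(fc, fm=40.0, peak=1.0):
    """100% amplitude-modulated tone, rescaled to the given peak amplitude."""
    x = (1 + np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)
    return x * (peak / np.max(np.abs(x)))

pt500 = pure_tone(500.0)
am500 = am_tone(500.0)

def power_db(x):
    """Average power in dB (arbitrary reference)."""
    return 10 * np.log10(np.mean(x**2))
```

With equal peaks, the AM tone spends much of its duration near the envelope trough, so its average power lands about 4.3 dB below that of the pure tone, matching the value stated above.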
In each change condition, 100 stimulus sequences for each TS were presented in random order, resulting in 400 trials. The interstimulus interval between TS was fixed to 0.8 s.
Experiment 2 (Pulse-Train Stimuli)
As in experiment 1, TS were composed of 2 parts. Each 0.75-s part of the TS consisted of one of the following stimuli: 32- or 48-Hz pulse trains, half-octave band-pass filtered either between 2800 and 4000 Hz (TS32Low and TS48Low) or between 4000 and 5600 Hz (TS32High and TS48High). The TS changed from the first to the second part at a latency of 0.75 s after stimulus onset. As in experiment 1, the stimuli were combined to form 2 conditions: spectral change or temporal change. Sound waveforms of the spectral change and temporal change conditions are displayed in Figure 2 and Supplementary Figure S2. In the case of spectral change, the spectral sound information changed, whereas the temporal pattern remained constant (TS32Low → TS32High, TS32High → TS32Low, TS48Low → TS48High, and TS48High → TS48Low; audio material provided with Supplementary Fig. S2). In the case of temporal change, only the temporal sound pattern changed (TS32Low → TS48Low, TS32High → TS48High, TS48Low → TS32Low, and TS48High → TS32High; audio material provided with Supplementary Fig. S2). All sound stimuli were prepared and presented as in experiment 1. The hearing threshold for TS32Low was determined and the intensity set to 60 dB sensation level. The other TS were adjusted to have identical power, resulting in a 1.8-dB lower maximal amplitude for the 48-Hz pulse trains compared with the 32-Hz pulse trains. In each change condition, 100 stimulus sequences for each TS were presented in random order, resulting in 400 trials. As in experiment 1, the interstimulus interval between TS was fixed, here to 0.75 s.
Of note, in the temporal change condition it was impossible to equalize both the maximal amplitude and the average power between the first and second stimulus parts at the same time. Therefore, in experiment 1 the maximal TS amplitude was set identical between the first and second stimulus parts, which resulted in power differences between parts in the temporal change condition only. To rule out the possibility that these power differences influenced the obtained result pattern, the average TS power was instead set identical in experiment 2.
Data Acquisition and Analysis
Auditory evoked fields were measured with a helmet-shaped 275-channel whole-head MEG system (VSM Med-Tech Ltd., Coquitlam, BC, Canada) with a gradiometer configuration of the pickup coils, installed in a quiet and magnetically shielded room. The magnetic field signals were digitally sampled at a rate of 600 Hz. In order to keep subjects in a stable alert state, they watched a silent movie of their choice during the MEG recordings.
Epochs of magnetic field data, ranging from 300 ms preonset to 300 ms postoffset, were averaged selectively for each condition and each experiment after rejection of artifact epochs containing field changes larger than 3 pT. The averaged magnetic field signals were 30-Hz low-pass filtered and baseline corrected based on the 300-ms prestimulus silent interval. The magnetic fields of single equivalent current dipoles (one for each hemisphere), corresponding to the N1m response elicited by the sound changes, were fitted to the averaged magnetic field distribution for each subject and condition (Fig. 3). The 10-ms time window prior to the maximal global field power of the N1m response, which peaked at around 100 ms after the sound change, was used for dipole source estimation. The origin of the coordinate system for the dipole locations and orientations was set at the midpoint of the medial–lateral axis between the center points of the entrances to the ear canals (positive toward the left ear). The posterior–anterior axis ran between the nasion and the origin (positive toward the nasion), and the inferior–superior axis ran through the origin perpendicular to the medial–lateral and posterior–anterior axes (positive toward the vertex). The calculated single equivalent dipole locations were not significantly different between spectral and temporal change conditions. Because N1m source strength is easily modulated by the depth of the estimated source location (Hillebrand and Barnes 2002), using identical source locations and orientations across conditions yields more reliable source strengths. Therefore, in order to improve the signal-to-noise ratio and to achieve a reliable comparison between spectral and temporal change conditions, we estimated the single dipoles based on the MEG waveforms grand averaged across all TS and used the identical N1m source location and orientation to calculate the source strengths.
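The preprocessing steps above (3-pT artifact rejection, averaging, prestimulus baseline correction) can be sketched on toy single-channel data; the amplitudes, noise level, and artifact shape below are hypothetical and are not taken from the recorded data:

```python
import numpy as np

fs = 600.0                     # MEG sampling rate (Hz), as in the study
n = int(fs * 2.2)              # epoch: -300 ms to 300 ms past a 1.6-s stimulus
t = np.arange(n) / fs - 0.3    # time axis (s), 0 = stimulus onset

rng = np.random.default_rng(0)
# toy epochs: a small evoked bump (~100 ms latency) plus sensor noise (tesla)
epochs = (50e-15 * np.exp(-((t - 0.1) / 0.02) ** 2)
          + rng.normal(0.0, 20e-15, (100, n)))
epochs[::25, n // 2:] += 5e-12   # simulate 4 epochs with >3 pT field jumps

def average_epochs(ep, reject=3e-12):
    """Reject epochs whose peak-to-peak field change exceeds `reject` (3 pT),
    average the remainder, and baseline-correct on the prestimulus interval."""
    ok = ep[np.ptp(ep, axis=1) < reject]
    avg = ok.mean(axis=0)
    return avg - avg[t < 0].mean(), len(ok)

avg, n_kept = average_epochs(epochs)
```

Here the 4 contaminated epochs are dropped and the remaining 96 are averaged, so the prestimulus baseline of the result is zero by construction.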
The estimated source for each subject in each hemisphere was fixed in location and orientation, and the source strengths were calculated for all time points and for each condition. Additionally, in order to examine whether the usage of the single dipole model was appropriate, we calculated sLORETA (Pascual-Marqui 2002; Pascual-Marqui et al. 2002) to assure that no additional sources were concurrently active.
The source strength of the maximal N1m response elicited in each condition (N1m_SC or N1m_TC) and in each hemisphere was individually normalized with respect to the average of the two maximal N1m source strengths ((N1m_SC + N1m_TC)/2). The normalized N1m source strengths and latencies were evaluated separately by means of repeated-measures analysis of variance (ANOVA) using 2 factors (CHANGE CONDITION: spectral change and temporal change; HEMISPHERE: left and right).
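The normalization can be written compactly; the per-subject values below are hypothetical and serve only to show that the procedure removes between-subject amplitude differences while preserving each subject's condition ratio:

```python
import numpy as np

# hypothetical peak N1m source strengths (nAm) for one hemisphere,
# 3 example subjects (NOT data from the study)
n1m_sc = np.array([22.0, 18.5, 25.0])   # spectral change condition
n1m_tc = np.array([19.0, 21.0, 20.0])   # temporal change condition

# per-subject normalizer: mean of the two condition maxima
norm = (n1m_sc + n1m_tc) / 2
sc_norm = n1m_sc / norm
tc_norm = n1m_tc / norm
```

By construction the two normalized values average to 1 for every subject, so condition and hemisphere effects can be compared across subjects with very different absolute dipole moments.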
Results
Clearly identifiable auditory evoked fields were obtained from all subjects in both experiments. The goodness of fit of the source location estimation was between 94.1% and 99.3% (mean 97.4%) in experiment 1 and between 94.1% and 99.0% (mean 96.3%) in experiment 2, confirming the adequacy of the underlying model assumption. Magnetic field waveforms grand averaged across all TS for one individual subject in experiment 1 are shown in Figure 3A. The figure demonstrates clear P1m and N1m responses elicited by the onset of the test stimulus as well as the second N1m response peaking at around 100 ms after the stimulus change. The contour maps in Figure 3B show clear dipolar patterns over the left and right hemispheres for this second N1m response. Figure 3C displays the calculated equivalent current dipoles as well as the results of sLORETA (overlaid onto a 3-dimensional rendering of the subject's structural magnetic resonance image), exhibiting focused neural activity in the auditory cortex. The sLORETA result completely overlapped the single dipole solution, verifying that no additional sources were active in the brain at this time point.
The calculated means of the source strength waveforms for each hemisphere, averaged across all participants, are displayed in Figure 4 (experiment 1) and Figure 5 (experiment 2). Clear P1m, N1m, and P2m responses corresponding to the onset of the test stimulus as well as to the stimulus changes were observed in both experiments. However, in the present study, we focused on the second N1m response elicited by the sound change. The goodness of fit for the P1m and P2m responses was insufficient to estimate reliable sources, due to the small signal-to-noise ratio of the P1m and the previously reported uncertainty regarding the number and locations of source components of the P2m (Godey et al. 2001). The response waveforms corresponding to the first part of the TS overlapped almost perfectly between spectral and temporal change conditions in both experiments 1 and 2, as expected, because the first parts of the TS were counterbalanced between the spectral change and temporal change conditions.
The averages of the normalized N1m source strengths and latencies of both hemispheres for each condition of experiments 1 and 2 are presented in Figure 6. The repeated-measures ANOVA applied to the normalized N1m source strength revealed a significant main effect of CHANGE CONDITION (F1,12 = 4.867, P < 0.05) only in experiment 2. There was a significant interaction between CHANGE CONDITION and HEMISPHERE in both experiments (experiment 1: F1,12 = 5.176, P < 0.05; experiment 2: F1,12 = 4.854, P < 0.05): the N1m response elicited by the spectral stimulus change was larger in the right hemisphere, whereas the N1m response elicited by the temporal change was larger in the left hemisphere. The repeated-measures ANOVA applied to the N1m latency revealed a significant main effect of CHANGE CONDITION in both experiments (experiment 1: F1,12 = 78.11, P < 0.0001; experiment 2: F1,12 = 67.95, P < 0.0001): the N1m response elicited by the temporal stimulus change was delayed compared with that elicited by the spectral change. Hence, the result patterns were almost identical between experiments 1 and 2.
Discussion
The results obtained in the present study clearly confirmed that the neural responses elicited by spectral versus temporal stimulus change differed between hemispheres. The N1m responses evoked by the spectral stimulus change were larger in the right hemisphere, whereas the N1m responses evoked by the temporal stimulus change were larger in the left. Moreover, N1m responses elicited by the temporal stimulus change were significantly delayed compared with the spectral stimulus change condition. Crucially, and in contrast to previous studies (Zatorre and Belin 2001; Jamison et al. 2006), both the overall physical sound input and the interstimulus intervals (which evidently influence the evoked neural activity; Davis et al. 1966; Hari et al. 1982; Okamoto et al. 2004) were identical between stimulus change conditions. Therefore, neither onset, sustained, nor offset responses elicited by the first and second parts of the TS per se can explain the differential N1m-response pattern observed between spectral and temporal change conditions. Only the pattern of change from the first to the second part of the TS differed between conditions, and thus must have caused the difference in the N1m responses. Additionally, the MEG data were acquired in a silent environment, which excluded a potential influence of ambient acoustic noise on test signal–related neural activity in the auditory cortex. Also noteworthy, in contrast to many previous studies, we used simple auditory stimuli, namely pure tones with and without amplitude modulation in experiment 1 and band-pass filtered pulse trains in experiment 2. Further, the congruent results of experiments 1 and 2 indicate that the hemispheric laterality of neural responses does not depend on a specific sound type but rather on spectral and temporal variances, which are differentially encoded into neural activity in the cochlea.
Therefore, our results strongly suggest that hemispheric lateralization for auditory processing is not only limited to sounds conveying meaning (e.g., speech or music) but that it at least partly originates from early basic neural processing levels dealing with the spectral and temporal features of auditory inputs (Zatorre and Belin 2001; Jamison et al. 2006).
Different neural encoders for spectral versus temporal stimulus change would imply that different neural mechanisms generate the N1m response in the two change conditions. In the spectral change condition, the spectral stimulus information was altered, but the temporal envelope remained identical between the first and second parts of the stimuli. Previous studies confirmed that N1m responses are tonotopically organized in human auditory cortex (Romani et al. 1982; Pantev et al. 1989, 1995). Hence, in the spectral change condition, the neural populations activated by the first versus the second part of the auditory stimuli were located at different places on the tonotopic maps of the auditory cortex. Therefore, the N1m response elicited by the spectral change can be considered as a combination of onset-evoked activity corresponding to the second frequency part of the stimulus and offset-evoked activity corresponding to the first frequency part. Liégeois-Chauvel et al. (2001) revealed by means of intracranial recordings in humans that the right hemisphere exhibits more clearly spectrally organized tonotopic maps, with distinct separations between regions processing different frequencies, whereas the left hemisphere is less clearly tonotopically organized. In the present study, the neural groups activated by the first and second stimulus parts in the spectral change condition might overlap within the tonotopic map. Neurons in such areas of overlap could not contribute to the N1m responses elicited by the spectral change because they were already preactivated by the first stimulus part and hence could not be newly activated by the onset of the second stimulus part. Due to the clearer tonotopic organization, the neural groups activated by the first and second stimulus parts would have overlapped less in the right auditory cortex than in the left.
Therefore, in the right hemisphere, more neurons would have contributed to the generation of the second N1m, consequently resulting in relatively larger N1m responses elicited by the spectral stimulus change in the right compared with the left hemisphere.
After temporal change, the neural group that was activated within the tonotopic map remained identical to the one activated by the first stimulus part, even though the temporal envelope of neural activity (which was not tonotopically encoded because the test signals in both experiments 1 and 2 did not contain the frequencies corresponding to the temporal envelopes of the AM tone [40 Hz] and the pulse trains [32 and 48 Hz] in their spectra) altered when the sound stimulus changed from the first to the second part. Therefore, in the temporal change condition, the N1m response was probably elicited by temporal-activity pattern changes from the first to the second stimulus part within one and the same neural population. Our results indicate that these changes were dominantly processed in the left auditory cortex. Recent neuroimaging studies showing left hemispheric dominance for temporal processing support our finding: Tallal et al. (1993) argued that the processing of rapidly changing sound inputs is essential for speech signal processing, and these authors suggested left hemispheric dominance for the processing of rapidly (within milliseconds) changing auditory events. Furthermore, Belin et al. (1998) and Poeppel (2003) pointed out that the left hemisphere had better temporal resolution capabilities compared with the right, which also contributed to left hemispheric dominance for speech processing. These authors proposed that the basis for this dominance could be the application of shorter temporal integration windows (25–50 ms) in the left compared with the right hemisphere. Due to the trade-off between temporal and spectral analysis precision (Acoustic uncertainty principle; Joos 1948; Zatorre et al. 2002), high temporal resolution obtained with a short temporal integration window would result in relatively low spectral resolution, whereas high spectral resolution achieved with a long temporal integration window would result in relatively low temporal resolution. 
Therefore, instead of applying identical temporal integration time windows in both the left and right hemispheres, the left hemisphere would be specialized for rapid temporal processing by applying short integration time windows, at the expense of frequency resolution.
The present results indicated that the N1m-response latency elicited by the temporal change was significantly delayed compared with the spectral change. This result also supports the hypothesis that spectral and temporal changes are processed by different neural mechanisms. The spectral information would be encoded into a location code within the tonotopic maps, whereas the temporal information would be encoded into the temporal pattern (the envelope) of corresponding neural activity. The integration time windows needed for carrier frequency analysis would be much shorter compared with the windows needed for the analysis of the temporal pattern. Therefore, the spectral change could be encoded quite quickly after the onset of the second part of the test stimulus, whereas the temporal change needed a longer integration time window to be detected. The N1m latency difference between spectral and temporal change conditions strongly suggests that the underlying neural mechanisms encoding spectral versus temporal information are different and require different temporal integration windows in the human auditory cortex.
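As a back-of-envelope check (our own arithmetic, not an analysis from the study), the stimulus parameters themselves suggest why detecting the temporal change requires a longer observation window than detecting the spectral change:

```python
# rough observation-time arithmetic implied by the stimulus parameters
# (all values in seconds; illustrative reasoning, not measured quantities)
am_cycle = 1 / 40.0          # one period of the 40-Hz envelope (experiment 1):
                             # at least ~25 ms to register the new envelope
beat_period = 1 / (48 - 32)  # the 32- vs 48-Hz trains of experiment 2 diverge
                             # appreciably only over ~one beat period (62.5 ms)
carrier_shift = 3 / 500.0    # a 500 -> 2000 Hz carrier change shifts spectral
                             # energy within a few carrier cycles (~6 ms)
```

The envelope-defined changes thus need integration windows an order of magnitude longer than the carrier-defined change, consistent with the delayed N1m in the temporal change condition.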
Zatorre and Belin (2001) and Jamison et al. (2006) not only showed left hemispheric dominance during temporal and right hemispheric dominance during spectral auditory processing, but they also showed that temporal variation elicited relatively more activation in primary auditory cortex and less activation in anterior superior temporal areas than spectral variation. Our present results did not indicate significant differences of estimated dipole locations of the N1m response (which is thought to originate in lateral aspects of Heschl's gyrus and the posterior temporal plane [Pantev et al. 1995; Eggermont and Ponton 2002]) between spectral and temporal change conditions. Several explanations for this inconsistency between studies are plausible. First, here we focused on neural responses evoked at specific latencies, whereas PET and fMRI (due to their lower temporal resolution capabilities) show the overall cerebral blood flow change between conditions. Second, we grand averaged across the 4 TS conditions for both spectral as well as temporal change conditions, leading to counterbalance of physical stimulus features between first and second parts of the TS. Theoretically, the N1m-response source locations might differ between different TS. However, as shown in Figure 3C, a distributed source model (sLORETA; Pascual-Marqui 2002; Pascual-Marqui et al. 2002) indicated only one single peak of neural activity in each auditory cortex and no additional sources elsewhere in the brain. Moreover, the high goodness-of-fit values and the dipolar field distributions of the contour maps confirmed that the use of the single dipole approach was appropriate.
In conclusion, using appropriately and carefully constructed auditory stimuli, the present study clearly demonstrates that spectral sound input change is dominantly processed in the right hemisphere, whereas temporal change is dominantly processed in the left hemisphere. These observations strongly support the hypothesis that the human auditory cortex of the left hemisphere has superior temporal resolution capabilities, whereas the auditory cortex of the right hemisphere has better spectral resolution capabilities. In demonstrating that laterality of function is evident at a quite basic level of auditory processing, these findings contribute to the global understanding of hemispheric lateralization of the human auditory cortex function.
Funding
Deutsche Forschungsgemeinschaft (Pa 392/13-1 and Pa 392/10-3); Bundesministerium für Bildung und Forschung/Deutsches Zentrum für Luft- und Raumfahrt grant (01GW0520).
We thank Andreas Wollbrink for technical support, Karin Berning for organizing the data acquisition, our test subjects for their diligent collaboration, and Patrick Bermudez for providing helpful comments. Conflict of Interest: None declared.