The weak field specificity and the heterogeneity of neuronal filters found in any given auditory cortex field do not substantiate the view that such fields are merely descriptive maps of sound features. But field mechanisms were previously shown to support behaviourally relevant classification of sounds. Here we tested, in human auditory cortex (AC), the prediction that classification tasks rather than the stimulus class per se determine which auditory cortex area is recruited. By presenting the same set of frequency modulations, we found that categorization of their pitch direction (rising versus falling) increased functional magnetic resonance imaging activation in right posterior AC compared with stimulus exposure, in contrast to the left posterior AC dominance during categorization of their duration (short versus long). Thus, top-down influences appear to select not only auditory cortex areas but also the hemisphere for specific processing.
Specializations of auditory cortex (AC) for sounds including speech and music have been viewed as analytical dispositions for physical properties of stimuli, namely that right and left hemispheres are recruited by stimuli with high demand on spectral and temporal processing, respectively (Zatorre et al., 2002). However, any sound contains spectral and temporal information. The significance of one or the other for a receiver may change with conceptual listening involving, for example, discrimination or categorization. This could modify the representation of the physical properties of sounds. In animal AC, electrophysiological correlates of categorization have recently emerged in right AC from experiments with frequency modulations (FM) (Ohl et al., 2001, 2003a,b). There it was shown that, after an initial state of descriptive (tonotopic) FM representation, AC can switch to a state in which the two categories rising and falling FM are captured independent of frequency content.
This approach was adapted to the present fMRI study with the aim of distinguishing areas of human AC in the two hemispheres that are related to such categorizations. The design in experiment I was to present the same set of linearly frequency modulated tones initially for uninformed listening and then for the task of categorizing FM direction (rising versus falling). In experiment II categorization of direction was compared with categorization of duration (long versus short) of stimuli of the same set. This addressed the question of feature-selectivity of category-dependent activation. The design with the chosen FM stimuli has two further implications, related to (i) the generally uncertain functional differences between AC areas for FM processing and (ii) the unsettled question of hemispheric specialization for FM as features of speech prosodies.
Single unit mapping results in animal AC have left an explanatory gap as to functions of different fields. Fields contain types of neuronal filters tuned to various spectral and temporal sound features. The specialization of any member of a type covers only a restricted range of a feature dimension. The prevailing (bottom up) concept derived from such properties views fields as descriptive maps of stimulus parameters sufficiently challenged experimentally at the neuronal level by simple exposure to sounds and even in an anesthetized state (Clarey et al., 1992; Suga, 1994; Ehret, 1997; Eggermont, 1998; Rauschecker, 1998; Kaas et al., 1999; Schreiner et al., 2000; Nelken, 2002; Read et al., 2002). But except for highly specialized animals (Stiebler, 1987; Suga, 1994) to date a given neuronal selectivity described by receptive field organization and stimulus response transfer functions has turned out to be only one of many heterogeneous selectivities in a field and the same selectivities may be found across various fields with merely quantitative differences. This has previously led to the synopsis that ‘similarities of AC fields outweigh differences’ (Eggermont, 1998).
Frequency modulations (FM), which are addressed in this study, are a particularly salient case as they activate large populations of neurons in primary and secondary AC fields. Comparisons of a few fields have revealed differences of spatial distribution of FM responsive clusters of neurons but only small differences of field selectivity for features like direction and steepness of modulation (Whitfield and Evans, 1965; Mendelson and Cynader, 1985; Phillips et al., 1985; Heil et al., 1992a,b; Mendelson and Grasse, 1992; Mendelson et al., 1993; Shamma et al., 1993; Tian and Rauschecker, 1994, 1998; Gaese and Ostwald, 1995; Kowalski et al., 1995; Schulze et al., 1997; Heil and Irvine, 1998; Horikawa et al., 1998; Ricketts et al., 1998; Nelken and Versnel, 2000; Ohl et al., 2000; Zhang et al., 2003).
It does not necessarily follow from these electrophysiological results that imaging studies of human AC using sound exposure without tasks also fail to identify area-specific activations. They measure population responses of neurons in both hemispheres and especially with variation of sound features they may detect differences in stimulus responsiveness not obvious from the statistics of single unit mapping. Several corresponding studies on differential stimulus responsiveness of AC areas have been published (e.g. Baumgart et al., 1999; Belin et al., 2000; Binder et al., 2000; Wessinger et al., 2001; Zatorre and Belin, 2001; Hall et al., 2002; Harms and Melcher, 2002; Hart et al., 2002; Patterson et al., 2002; Warren et al., 2002), and one found right–left AC differences for level-dependent processing of FM sweeps (Brechmann et al., 2002). However, these results do not allow a functional interpretation in terms of conceptual listening tasks which may change the representations obtained with sound exposure. The present experiment was designed to clarify this issue.
An important aspect of the selected stimulus class of FM sweeps may be its relevance for human communication. Changes of voice fundamental frequency (F0), independent of individual voice, are a key to discrimination of prosodies in speech and of melody contours in song (Frick, 1985; Scherer, 1995; Banse and Scherer, 1996). Even though linear FM are highly impoverished models of these voiced sounds, the direction of pitch change may represent a crucial parameter. This view is supported by a study (Divenyi and Robinson, 1989) showing that patients with right brain damage not only had prosodic deficits but also showed impairments in directional discrimination of linear FM stimuli. Furthermore, animal studies showed that directional discrimination of FM is dependent on an intact auditory cortex (Ohl et al., 1999; Harrington et al., 2001) and relies predominantly on the right AC (Wetzel et al., 1998). Therefore we expected right AC regions to be specifically involved in directional categorization of FM.
Materials and Methods
In experiment I 10 females and eight males (20–36 years old, mean age 24.7) and in experiment II eight females and eight males (20–39 years old, mean age 25.5) were scanned. One subject of experiment I repeated the task six times over a period of two months. None of the subjects of experiment II had participated in experiment I. All subjects were right handed (Edinburgh Handedness Inventory) with normal hearing, and had given written informed consent to the study, which was approved by the ethical committee of the University of Magdeburg.
For experiment I two sets of 40 linearly frequency modulated (FM) tones were used as stimuli. Each covered half an octave in 0.5 s of either low frequency range (0.1–0.15 kHz, 0.2–0.3 kHz in steps of 100 Hz up to 2–3 kHz and the reverse) covering frequency changes in speech or high frequency range (5.0–7.5 kHz, 5.1–7.65 kHz in steps of 100 Hz up to 6.9–10.35 kHz and the reverse) (Fig. 1). Each block consisted of 45 randomized FM (repetition rate 1 Hz) of either the low or high frequency range. One experimental session consisted of 16 alternating ‘silence’ and stimulus blocks of 45 s, resulting in 12 min total duration. During the first part of the session, subjects were instructed to attend to the stimuli with no specific task (uninformed listening). During the second part, immediately following the first, subjects had to categorize upward and downward FM presented with a target rate of 50%. A group of nine subjects had to press a mouse button indicating downward FM and a group of eight subjects indicating upward FM. For a correct response, subjects had to press within 0.5 s after FM offset. In two subjects these behavioural responses were not recorded due to technical failure.
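The two stimulus sets of experiment I can be enumerated from the ranges given above. The following is only a minimal sketch: it assumes that the 100 Hz step applies to the lower edge of each sweep and that the upper edge is always 1.5 times the lower edge, as the listed examples (0.1–0.15 kHz, 2–3 kHz, 5.0–7.5 kHz, 6.9–10.35 kHz) suggest. The function name `fm_set` and its parameters are hypothetical.

```python
def fm_set(first_low_hz, last_low_hz, step_hz=100, ratio=1.5):
    """Return (start_hz, end_hz) pairs: rising sweeps followed by their reversals.

    Assumption (not stated explicitly in the text): the lower edge advances in
    100 Hz steps and the upper edge is ratio * lower edge.
    """
    lows = range(first_low_hz, last_low_hz + step_hz, step_hz)
    rising = [(low, low * ratio) for low in lows]
    falling = [(end, start) for start, end in rising]  # "and the reverse"
    return rising + falling


low_set = fm_set(100, 2000)    # 0.1-0.15 kHz up to 2-3 kHz
high_set = fm_set(5000, 6900)  # 5.0-7.5 kHz up to 6.9-10.35 kHz

# Session length: 16 alternating blocks of 45 s each.
session_seconds = 16 * 45
```

Under these assumptions each set contains 20 rising and 20 falling sweeps, i.e. the 40 FM tones per set described above, and the session length comes out at 720 s (12 min).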
For experiment II a set of 32 FM of either 0.4 or 0.6 s duration served as stimuli. The centre-frequency (Fc) varied between 1 and 3.2 kHz in steps of 100 Hz. To achieve the same speed of modulation for 0.4 s and 0.6 s FM of the same centre-frequency, the starting and end frequencies were calculated by Fc ± Fc/2 × FM duration. Each block of 46 randomized FM contained 23 upward FM and 23 downward FM, half of which had 0.4 s duration. The interval between two FM was 500 ms. One experimental session consisted of 20 alternating ‘silence’ and stimulus blocks of 46 s, resulting in 15 min 20 s total duration. During one half of the session subjects had to press a mouse button indicating 0.4 s FM and during the other half they had to indicate downward FM. For a correct response, subjects had to press within 0.5 s after FM offset. In one subject these behavioural responses could not be recorded due to technical failure. Task sequence was counterbalanced between subjects and the subjects were not informed about a change of instruction. The digitized instructions (8 s each) were inserted at the beginning of the first and second half of the stimulus file.
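The rule for the sweep edges can be checked numerically. The sketch below assumes that the expression is read as Fc ± (Fc/2) × duration with the duration in seconds; the function names are hypothetical. Under this reading the modulation speed equals Fc Hz per second regardless of duration, which is the property the design aims for.

```python
def fm_edges(fc_hz, duration_s):
    """Lower and upper sweep edge, reading the rule as Fc +/- (Fc / 2) * duration.

    Assumption: duration is expressed in seconds.
    """
    half_excursion = fc_hz / 2 * duration_s
    return fc_hz - half_excursion, fc_hz + half_excursion


def sweep_speed(fc_hz, duration_s):
    """Modulation speed in Hz per second for a linear sweep between the edges."""
    low, high = fm_edges(fc_hz, duration_s)
    return (high - low) / duration_s


short_edges = fm_edges(1000, 0.4)  # about 800 Hz and 1200 Hz
long_edges = fm_edges(1000, 0.6)   # about 700 Hz and 1300 Hz
```

An upward FM runs from the lower to the upper edge and a downward FM in the opposite direction; for Fc = 1 kHz both durations give a speed of about 1 kHz/s.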
The FM of both experiments had linear ramps of 10 ms and were presented with a sound pressure level of 70 ± 5 dB via fMRI-compatible electrodynamic headphones equipped with ear-muffs for further reduction of residual background noise (Baumgart et al., 1998).
Low-noise-fMRI experiments were carried out on a BRUKER 3T/60 head scanner equipped with a quadrupolar birdcage head-coil. Pilot scans were used for orientation of contiguous slices covering the superior temporal plane in both hemispheres by following the course of the sylvian fissure on both sides as closely as possible. Functional images were acquired by using a conventional FLASH sequence which was modified by using long gradient-ramp rise-time (2500 μs) which reduced the scanner noise to 54 dB (A) peak amplitude (for details of scanner noise-measurement see Brechmann et al., 2002). The beginning of each stimulus and ‘silence’ block coincided with the acquisition of a volume. In contrast to EPI sequences, which acquire a complete functional image during each repetition, the gradient echo sequence acquires one line in k-space during one repetition (TR). High T1-contrast imaging was used to obtain anatomical landmarks and immediately followed the fMRI. The subject's head was fixated with a vacuum cushion with attached ear muffs containing the fMRI compatible headphones.
In experiment I 80 functional volumes each consisting of four 8 mm slices were collected in two 12 min runs (TE = 30.7 ms; TR = 167 ms; flip angle = 15°; FOV = 18 cm). One slice was positioned above the sylvian fissure. In experiment II 120 functional volumes each consisting of three 6 mm slices were collected in 15 min 20 s (TE = 30.7 ms; TR = 127.9 ms; flip angle = 15°; FOV = 18 cm).
Each functional dataset was subjected to a quality check. First, subjects' head movement was monitored using the AIR package (Woods et al., 1998). In case of a continuous movement of >1 mm or rotation of >1° in any direction, the whole dataset would have been discarded (this was not the case in the two experiments). Second, the mean grey value of all functional volumes (obtained in the temporal lobe of two slices) was analysed because transient head movements can lead to large deviations in the mean grey value. Single images with grey-value deviations of >2.5% were excluded from further analysis. In case of exclusion of more than two images of one stimulus or silence condition, the whole functional dataset would have been discarded (this was not the case in the two experiments). After motion detection, image matrix size was increased to 128 × 128 by pixel replication followed by smoothing with a Gaussian filter [full-width half-maximum = 2 voxels (2.8 mm), kernel = 5 voxels]. Functional activation was assessed by linear vector-space analysis (Bandettini et al., 1993). A simple trapezoid function served as correlation vector, roughly modelling the expected BOLD response. The first image of each stimulus and ‘silence’ block was set to half-maximum values. This takes into account that the full development of the BOLD response and the return to baseline take a few seconds. The remaining images acquired during silence were set to minimum values and the remaining images acquired during stimulus periods were set to maximum values. In the second experiment, the two images acquired during each presentation of the instructions were excluded from analysis.
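The trapezoid correlation vector can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: a plain Pearson correlation stands in for the linear vector-space analysis, the session is assumed to start with a ‘silence’ block, and the number of images per block is a free parameter. All names are hypothetical.

```python
import math


def trapezoid_reference(n_blocks, images_per_block, first_is_silence=True):
    """Reference vector: 0.5 for the first image of each block,
    0.0 for the remaining silence images, 1.0 for the remaining stimulus images."""
    reference = []
    for block in range(n_blocks):
        is_silence = (block % 2 == 0) == first_is_silence
        plateau = 0.0 if is_silence else 1.0
        reference.append(0.5)  # transition image at half-maximum
        reference.extend([plateau] * (images_per_block - 1))
    return reference


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


reference = trapezoid_reference(n_blocks=4, images_per_block=5)
# Toy voxel time course (made-up numbers) that follows the reference exactly.
toy_voxel = [100.0 + 2.0 * value for value in reference]
r = pearson_r(toy_voxel, reference)
```

A voxel whose time course follows the block design yields a correlation close to 1, whereas a voxel unrelated to the design yields a value near 0.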
The activation of auditory cortex was scrutinized with BrainVoyager2000™ in a three-dimensional analysis of both hemispheres of individual brains in relation to the prominent anatomical landmarks insular sulcus, first transverse sulcus, Heschl's sulcus and superior temporal sulcus (Fig. 1). This served to determine the continuity of activation patterns across slices and the regional parcellation of AC activation. Generally, activation patterns appeared as more or less separate clusters of activated voxels in different regions of AC (Figs 1 and 3). In both experiments, activated voxels (P < 0.05) in each slice were attributed to one of the four territories TA, T1, T2 and T3 defined as previously described (Brechmann et al., 2002). In short, territory T1 is located on the antero-medial rim of Heschl's gyrus (red activation) and presumably covers core fields including primary AC. Due to a lack of further reliable anatomical landmarks or functional borders as demonstrated in a 7 T fMRI study (Formisano et al., 2003) we could not subdivide T1 into a medial and a lateral aspect. T2 was centred on Heschl's sulcus (green) and presumably comprises the lateral belt areas with secondary AC fields. We find the stripe-like clusters of activation in T1 and T2 to be spatially separated, which has also been shown by others (Dhankhar et al., 1997; Hashimoto et al., 2000; Di Salle et al., 2001; Schönwiesner et al., 2002; Formisano et al., 2003). T3 covers the posterior part and lateral surface areas of the planum temporale (blue). TA (yellow) was defined to be anterior to the first transverse sulcus on the planum polare (presumably medial belt areas).
For each of these territories the intensity weighted volume (IWV), i.e. the number of activated voxels multiplied by their average BOLD signal intensity was determined.
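As a worked illustration of this definition, the sketch below computes the IWV per territory from the signal intensities of the activated voxels. The function name and the numbers are hypothetical and serve only to show the arithmetic.

```python
def intensity_weighted_volume(voxel_intensities):
    """IWV = number of activated voxels * their average BOLD signal intensity."""
    n_voxels = len(voxel_intensities)
    if n_voxels == 0:
        return 0.0  # no activated voxel in this territory
    mean_intensity = sum(voxel_intensities) / n_voxels
    return n_voxels * mean_intensity


# Made-up signal intensities of the activated voxels in two territories.
toy_territories = {"T1": [1.0, 2.0, 3.0], "T3": [2.0, 2.0]}
toy_iwv = {name: intensity_weighted_volume(values)
           for name, values in toy_territories.items()}
```

Because the voxel count multiplies the mean, the IWV equals the summed intensity of the activated voxels, so it grows with both the extent and the strength of activation.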
In addition to this region of interest approach, we analysed the grand average activation in experiment II using BrainVoyager2000™. Because the angle between the AC–PC plane and the course of the sylvian fissure varies considerably from subject to subject (between 17 and 30° in our sample), we chose the following approach of averaging. First, the brains of all subjects were Talairach transformed. Then we chose the brain of one subject which best fitted the standard Talairach brain with respect to the angle between the AC–PC plane and the sylvian fissure (24°). All other brains were adjusted to this reference brain by rotation and translation such that the sylvian fissures and the medial base of Heschl's gyrus of all brains superimposed as closely as possible. Then we compared activation for categorization of FM direction versus categorization of FM duration and the reverse and determined the Talairach coordinates of each peak activation in AC.
The laterality index of activation (IWV) in the combined territories of the auditory cortex (AC) was determined as activation of right AC minus activation of left AC, divided by the total activation in both AC. During the categorization tasks subjects' responses were recorded and analysed using signal detection theory (Swets et al., 1961), which resulted in a sensitivity index d′ for each subject. To test whether the subjects were able to perform the task above chance at the 1% level, the subjects' responses were subjected to a χ2-test separately for each frequency range.
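Both measures can be written down compactly. The sketch below assumes the laterality index (right − left) / (right + left) and the standard equal-variance signal detection definition d′ = z(hit rate) − z(false alarm rate). The text does not specify how hit or false alarm rates of exactly 0 or 1 were handled, so no correction is applied here. Function names and counts are hypothetical.

```python
from statistics import NormalDist


def laterality_index(right_iwv, left_iwv):
    """(right - left) / (right + left); positive values mean right dominance."""
    total = right_iwv + left_iwv
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (right_iwv - left_iwv) / total


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate).

    Assumption: equal-variance Gaussian model; rates of exactly 0 or 1
    are not corrected and make inv_cdf raise an error.
    """
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


toy_li = laterality_index(right_iwv=3.0, left_iwv=1.0)
toy_d = d_prime(hits=16, misses=4, false_alarms=4, correct_rejections=16)
```

The index ranges from −1 (activation only in left AC) to +1 (activation only in right AC), and d′ is 0 when hit rate and false alarm rate are equal.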
Behavioural Results of Experiment I
The performance of subjects during categorization of rising and falling FM showed a large range of variation for the low and the high frequency range. For the high frequency range three subjects performed the task below chance level, one of whom also performed below chance for the low frequency range (χ2-test, P < 0.01).
A t-test on task performance (d′) for the two frequency ranges revealed that the direction of FM in the high frequency range was better categorized than FM direction in the low frequency range (P < 0.05). This was independent of the fact that three subjects in the high frequency range performed below chance compared with one in the low frequency range.
fMRI Activation in Experiment I
The global AC activation in terms of intensity weighted volume (IWV) in the two hemispheres showed no significant side differences in the two conditions, sound exposure versus rest and directional categorization versus rest (t-test of laterality index, P > 0.1). Similarly, an analysis of variance revealed no significant interaction between hemisphere and experimental condition (P > 0.2). These group results do not reflect that some individuals showed marked asymmetries of activation, namely the possibility to switch from left dominant activation during exposure to right dominant activation during categorization of the same stimuli (Fig. 2).
Comparing AC-activation in each hemisphere between the two conditions revealed a significant effect of experimental condition in right AC independent of the frequency range with stronger activation during the directional categorization (ANOVA: F(1,16)=10.23, P < 0.01) (Fig. 3A). Furthermore, variances of activation during both conditions were smaller in the right hemisphere.
These on average moderate hemispheric effects became more salient upon analysis of regional differences in AC. We previously found that imaging parallel to the sylvian fissure reveals clustering of AC activation which bears some relationship to anatomical landmarks and known cytoarchitectural subdivisions of AC (Scheich et al., 1998; Brechmann et al., 2002). This permitted attribution of the separate clusters of activation in each individual to one of four territories (regions of interest, ROI) defined by the landmarks (see Methods and colorations in Fig. 2).
For a separate analysis of task-effects in each of these ROIs intensity weighted volume (IWV) was subjected to an analysis of variance covering experimental condition (uninformed listening, categorization of FM direction) and frequency range.
Main effects of experimental condition with stronger activation during directional categorization were found in the secondary posterior territories of right AC (T2: F(1,16) = 9.33, P = 0.008; T3: F(1,16) = 9.83, P = 0.006) (Fig. 4A) independent of frequency range. Main effects of frequency range were found in all AC-territories of the right hemisphere with stronger activation for low frequencies (TA: F(1,16) = 9.53, P = 0.007; T1: F(1,16) = 6.98, P = 0.018; T2: F(1,16) = 24.22, P < 0.001; T3: F(1,16) = 9.33, P = 0.008). There was no significant interaction of factors task×frequency range (P > 0.1).
Since performance of subjects varied we analysed Pearson correlations between activation in the different ROIs and sensitivity index d′ for each frequency range excluding subjects with below chance performance (see above). Such correlations for both frequency ranges were significant exclusively in the right T3 (low FM: r = −0.618; P = 0.018; high FM: r = −0.800; P = 0.002). This inverse correlation of activity in right T3 and performance (Fig. 5) also holds for an individual subject who performed the task six times.
In summary, the results show that the right AC and especially secondary posterior areas became more strongly involved when FM stimuli were categorized according to direction. But with respect to specificity of the task the results do not exclude that any other categorization involving these stimuli would lead to similar results.
To test for specificity, experiment II, with different subjects, used similar FM stimulus material that allowed two different tasks to be performed on the same material, namely directional categorization and categorization of the duration of FM.
Behavioural Results of Experiment II
Behavioural data of subjects during categorization of FM direction and FM-duration for both tasks were subjected to a χ2-test. This revealed that three subjects in the durational categorization and two different subjects in the directional categorization performed the task below chance level (P < 0.01).
Analysis of variance of performance (d′) with factors task and task sequence revealed no significant effects.
fMRI Activation in Experiment II
In some individuals the global activation patterns changed from a left lateralization during categorization of FM-duration to a right lateralized activation during categorization of FM direction. A difference in laterality was significant as a group effect as revealed by a t-test of laterality-indices of global left and right AC activation (P < 0.05) with a stronger right lateralization during categorization of FM direction. Comparing AC-activation in each hemisphere between the two conditions revealed a significant effect of experimental condition in right AC independent of the task sequence with stronger activation during the directional categorization [analysis of variance (ANOVA): F(1,14) = 8.55, P < 0.01; Fig. 3B].
Activation in each territory was subjected to an ANOVA with factors task, task sequence, and hemisphere. Main effects of task [F(1,14) = 10.01, P = 0.008] as well as hemisphere [F(1,14) = 6.91, P = 0.022] showed a stronger activation during directional categorization compared with durational categorization in the primary area T1 of the right hemisphere. Interaction between factors task and hemisphere was significant in T3 [F(1,14) = 11.69, P = 0.005] with a stronger activation during categorization of direction over categorization of duration for right T3 (t-test, P = 0.003) and the converse for left T3 (t-test, P = 0.01; Fig. 4B). This double dissociation of the activation in T3 which covers the posterior part of Brodmann area (BA) 22 could be replicated in the grand average analysis with a direct activation contrast between the two tasks. It showed stronger activation for the durational vs directional categorization in left posterior BA 22 (Talairach coordinates: −59, −40, 14) and stronger activation for the directional versus durational categorization in right posterior BA 22 (60, −35, 11) (Fig. 6). Both coordinates are far posterior to Heschl's sulcus of the Talairach brain and thus fall into the T3 area.
Pearson correlations between d′ for the two tasks and the respective activation were not significant when including all subjects who performed the tasks above chance. This may partially be explained by a different design of experiment II with a sequence of two categorical tasks. To exclude a confounding sequential effect a correlation was calculated for those subjects alone who performed the directional categorization task first, like in experiment I. Even for this small number of six subjects the correlation approached significance (r = −0.718; P = 0.054).
Categorization Tasks Modulate Activation by Stimulus Exposure
When sets of objects are categorized abstract features are derived from similarity relationships of their physical properties (Knowlton, 1999). This conceptualization of key features can be performed in spite of variation of other features. Even though categorizations of objects as a function of semantic content require more associative brain areas (e.g. Gauthier et al., 2000; Adams and Janata, 2002) there is evidence that basic categorical operations on the feature level involve mechanisms in sensory cortex (Knowlton and Squire, 1993; Knowlton, 1999).
Our results support the initial hypothesis that a conceptual listening task such as categorization of FM features rather than the exposure to the sound material per se can recruit mechanisms in particular areas of the AC which make them functionally distinct. Exposure to the material strongly activated multiple primary and secondary areas on the dorsal surface of the temporal lobe. Categorization of FM direction independent of the frequency range increased activation over mere exposure only in right posterior AC areas (T2, T3). Such a task-related increase in activation in secondary areas compared with mere exposure was previously shown with functional imaging in human visual cortex (e.g. with moving visual stimuli: O'Craven et al., 1997; Büchel et al., 1998; Huk and Heeger, 2000). With respect to functional specificity, these findings in some visual cortex fields can be readily interpreted because the rather homogeneous and unique stimulus responsiveness of their neurons is well known from animal electrophysiological experiments. The interpretation of a similar task-related increase in activation in areas of the auditory cortex is less clear because neuronal populations within fields seem to be more heterogeneous and our results with mere exposure to the FM did not reveal any conspicuous local effects. There is only an indirect argument that the result of experiment I can be interpreted in favour of right posterior AC specialization for FM direction. Namely, the same right AC territories do not systematically change activation with increasing stimulus level of FM sweeps with varied direction (Brechmann et al., 2002). This level-tolerance is considered a prerequisite of selectivity for a particular information-bearing sound element because biologically important sounds must be interpreted independent of sound pressure levels (Suga, 1994).
The double categorization experiment II provides direct support for a specific involvement of right posterior AC in the processing of FM direction. Categorization of direction increased activation in right T3 over categorization of duration and categorization of duration increased activation in left T3 over categorization of direction. Consequently, this double dissociation suggests that it is the task of categorizing one or the other feature of FM, and not any selective representation of the class of FM sweeps per se in the right posterior AC, that determines the activation.
Similar differential task effects have been obtained in human visual cortex when subjects were instructed to discriminate stimulus speed which increased activation in MT/V5 compared with making judgements about colour of the same stimuli which increased activation in V4 (Corbetta et al., 1990; Chawla et al., 1999).
Prior studies showing hemispheric activity changes depending on the task (Zatorre et al., 1992; Fink et al., 1996; Stephan et al., 2003) reported primarily modulation of frontal or parietal areas possibly involved in top-down control (Corbetta and Shulman, 2002). Such areas were not sampled in our study, and therefore top-down influences arising from such regions could not be evaluated. However, the present results show modulations at earlier levels of information processing.
Neural Correlates of Task-performance
The argument most convincingly supporting the role of right T3 in directional categorization of FM is the correlation of its activation with task-performance. The lowest amount of activation in right T3 was found in subjects most proficient in the task and in the repeatedly scanned subject in the sessions with highest performance. Such dependences are considered good examples of criteria for a link between neural activity and perception (Parker and Newsome, 1998). The sign of the correlation, namely, that better performance relies on a lower amount of activation, has previously been described mainly for more associative brain areas (Gerlach et al., 1999; Buckner et al., 2000; Sunaert et al., 2000; Chein and Fiez, 2001; Adams and Janata, 2002; Gruber et al., 2002). A theoretical concept (Desimone, 1992) is that the specificity of stimulus representation can be improved by suppressing neurons which have a low stimulus selectivity. In this sense our initial tenet that fields contain mixtures of neurons serving descriptions of different stimulus features might be relevant. With good performance the activation of neurons representing task-irrelevant features of the FM like frequency range, steepness and duration might be reduced in favour of those representing the task-critical feature, namely FM direction. The lower amount of activation with better performance does not contradict experiments which show enlargements of cortical representations with relevant training (Recanzone et al., 1992, 1993). Training involves the sharpening of discrimination within the trained stimulus dimension and therefore seems to recruit more neurons, whereas in categorization the similarities of stimuli are emphasized over differences (Knowlton, 1999; Ohl et al., 2001).
The fact that the clearest evidence for a specific involvement in the categorization of FM direction has been found for right T3 does not exclude that more primary areas of the right AC might also be involved. A task-related increase in activation was also shown in the right T2 in experiment I and could be strong enough in individuals to shift activation even of primary AC (T1) to right dominance (Fig. 2). Thus it is possible that more primary areas of right AC are involved in the processing of FM direction.
These results are in line with the concept that right auditory cortex may be dominant in determining the direction of pitch changes. This was found by unilateral AC lesion experiments in a rodent with human-like hearing range, the Mongolian gerbil, using discrimination of FM sweeps (Wetzel et al., 1998), and in patients with right anterior temporal lobectomies extending into Heschl's gyrus using tone steps (Johnsrude et al., 2000). The similarity across species is a strong argument that directional pitch change discrimination is an old aspect of right hemisphere specialization. At the same time the similarity across FM and tone steps supports our hypothesis that this specialization is at the conceptual level and not necessarily stimulus-specific.
FM-related Processing in Human AC
Only a few lines of research have addressed processing of FM direction in human AC one of which used mismatch negativity (MMN). Rare directional reversals of a repeated linear sweep caused MMN for both upward and downward reversals (Sams and Näätänen, 1991). Mixtures of sweeps with different centre-frequencies but the same FM direction resulted in MMN to directional reversal of any FM (Pardo and Sams, 1993). This could be evidence of a directional categorization effect, but it remains unclear how the MMN, which requires no attention to stimuli, relates to our results, especially since only right AC was systematically studied.
A second line of research used speech-related formant transitions (spectral motion), the exposure to which led to more pronounced PET activation bilaterally in caudo-lateral areas of human AC in comparison to stationary spectral formants (Thivard et al., 2000). There was bilateral activation by slow transitions but reduced right AC activation by fast formant transitions (Belin et al., 1998). Since these formant transitions were both short (40 ms) and long (200 ms) but also steep (33.3 octaves/s) and shallow (6.6 octaves/s), these findings allow several interpretations but may be compatible with our results assuming that the left AC harbours mechanisms to distinguish FM durations. The hypothesis that right AC is specialized for relatively slow spectral changes was followed in an fMRI study (Müller et al., 2001) which presented slow FM (200 ms, 2.5 octaves/s) with a directional discrimination task, but found bilateral activation of AC (BA 22). The failure to find evidence supporting this hypothesis might be because only two FM stimuli (one rising and one falling) were used, so that the task did not require categorization, which also led to very high task performance.
Implications for Prosody Processing
The categorization-dependent lateralized and territory-specific activation of AC and especially the performance-correlated activation in right T3 may also explain mechanistic aspects of prosody perception in speech.
Key features of prosody are the direction of frequency modulations in the voice fundamental and variations of syllable duration and intensity (Fry, 1955; Frick, 1985; Scherer, 1995; Banse and Scherer, 1996). Since these features must be determined independently of individual voice, this task has an inherent aspect of categorization.
Based mainly on prosodic deficits frequently found in right brain damaged patients, it has long been hypothesized that the right temporo-parietal cortex is the substrate of prosody processing. However, the interpretation of these data is controversial (for reviews see Joanette et al., 1990; Ackermann et al., 1993; Baum and Pell, 1999; Myers, 1999). There are two possible main reasons for this. First, there may have been confounding or competing effects of linguistic content on cortex activation in part of the subjects. Supporting this view, several studies showed that right lateralized processing of prosody is more evident when using low pass filtered speech compared with normal speech (Blumstein and Cooper, 1974; Behrens, 1985; Perkins et al., 1996; Meyer et al., 2002).
A second reason might be that not all of the acoustic variables characterizing prosodies are processed in the same hemisphere (Blumstein and Cooper, 1974; Klouda et al., 1988; van Lancker and Sidtis, 1992).
Our results with FM as a simple model of F0 contours in prosodies suggest that essential variables are indeed processed in different hemispheres, the direction dominantly in the right and duration in the left AC. Since there was no correlation of activation with performance in the durational categorization task the evidence for the specific involvement of left T3 is weaker than for the involvement of right T3 in directional categorization. Therefore the possibility cannot be excluded that the chosen durations of FM in the durational categorization did not provide an optimal contrast between short and long. As a trend this task was also more difficult than the directional categorization task.
Generally, our results are consistent with suggestions of right and left auditory cortex function by Zatorre et al. (2002) that the left AC is relatively specialized for temporal processing whereas the right AC is specialized for spectral processing. Our results extend this stimulus-related hypothesis to top-down mechanisms of tasks. Top-down influences appear to select the hemisphere specifically involved in processing of stimuli containing both spectral (FM direction) and temporal (FM-duration) information depending on the task that has to be solved.
The research described in this paper was supported by SFB 426, Deutsche Forschungsgemeinschaft.