Abstract

Neurophysiological measures indicate cortical sensitivity to speech sounds by 150 ms after stimulus onset. In this time window dyslexic subjects start to show abnormal cortical processing. We investigated whether phonetic analysis is reflected in the robust auditory cortical activation at ∼100 ms (N100m), and whether dyslexic subjects show abnormal N100m responses to speech or nonspeech sounds. We used magnetoencephalography to record auditory responses of 10 normally reading and 10 dyslexic adults. The speech stimuli were synthetic Finnish speech sounds (/a/, /u/, /pa/, /ka/). The nonspeech stimuli were complex nonspeech sounds and simple sine wave tones, composed of the F1+F2+F3 and F2 formant frequencies of the speech sounds, respectively. All sounds evoked a prominent N100m response in the bilateral auditory cortices. The N100m activation was stronger to speech than nonspeech sounds in the left but not in the right auditory cortex, in both subject groups. The leftward shift of hemispheric balance for speech sounds is likely to reflect analysis at the phonetic level. In dyslexic subjects the overall interhemispheric amplitude balance and timing were altered for all sound types alike. Dyslexic individuals thus seem to have an unusual cortical organization of general auditory processing in the time window of speech-sensitive analysis.

Introduction

The speech signal is composed of a complex set of acoustic features, such as frequency range, amplitude, the duration of signal and pauses, and rapid changes in spectrum. Phonetic features must be extracted from this acoustic signal in order to proceed to phonological and, finally, semantic analysis. There is evidence for multiple representations and processing stages in the analysis of speech sounds in the human brain (for discussion, see Phillips, 2001), but it remains unsettled where, and in what time window, speech-specific information is extracted.

During the past decade, cortical areas specifically involved in speech sound analysis have been explored using functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). Speech stimuli have been shown to evoke more widespread activation than nonspeech stimuli in the superior temporal cortex, bilaterally or with slight left-hemisphere predominance (Demonet et al., 1992; Zatorre et al., 1992; Binder et al., 1994; Vouloumanos et al., 2001). When searching for the neural basis of phonetic processing, it is crucial to contrast speech sounds with acoustically comparable sounds, to exclude the possibility that any differences found are based on acoustic complexity alone. Contrasting phonetic versus acoustic analysis has revealed activation in the left superior and middle temporal gyri (STG and MTG) and the superior temporal sulcus (STS) (Binder et al., 2000; Benson et al., 2001; Vouloumanos et al., 2001).

Identification of the cortical loci selectively activated by speech sounds, however, provides only partial information. Speech perception is a very fast process — the signal is transformed from acoustic features to meaning within fractions of a second. Thus, especially for the early steps in the analysis of speech signal, it is likely that the neural representations of different stages and transformations are activated very briefly. The time course of auditory processing can be followed using neurophysiological measures, electroencephalography (EEG) and magnetoencephalography (MEG).

Semantic processing of spoken language starts around 200–300 ms after sound onset, as demonstrated, e.g. by studies using sentences with semantically congruent or incongruent final words (cf. Connolly et al., 1994; Helenius et al., 2002b). Phonetic/phonological information must thus be accessible by this time. Within the first 200 ms, speech-specificity has been tested using oddball paradigms. In these setups, frequent (standard) stimuli are interspersed with infrequent (deviant) stimuli. The difference between the responses to deviant and standard stimuli in auditory cortex is known as the mismatch response, or mismatch negativity (MMN) in EEG literature (Näätänen, 1992; Alho, 1995). The MMN typically reaches the maximum at ∼150 ms after stimulus onset. It is seen as a reflection of auditory sensory memory at the neuronal level. MMN behaves differently for speech and nonspeech stimuli (Aulanko et al., 1993; Phillips et al., 2000; Shtyrov et al., 2000; Vihla et al., 2000). Moreover, MMN responses to phoneme contrasts in the native language are stronger than those to non-native contrasts (Näätänen et al., 1997). Phonetic representation of the speech sound must thus be available at this time window to enable memory traces based on phonetic (or phonological) labels.

Whether speech-specific analysis is reflected in neural processing before the MMN time window is currently not established. The MMN signal is preceded by a robust activation of the auditory cortex at about 100 ms after sound onset, referred to as the N100m (or N100 in the EEG literature). Some studies suggest phonetic/phonological effects in this response, but others do not (Kuriki and Murase, 1989; Eulitz et al., 1995; Gootjes et al., 1999; Tiitinen et al., 1999). Gootjes et al. (1999) found significantly stronger N100m responses to vowels than to tones or piano notes over the left but not the right hemisphere. However, Eulitz et al. (1995) and Tiitinen et al. (1999) found no significant difference in the strength of the N100m response to speech and tone stimuli, although the N100m response was slightly later for speech sounds than for tones, in both hemispheres. The variability of the results is likely to be due largely to variability in the stimulus materials. In many of these studies, the main research question did not require careful acoustic matching of the speech and nonspeech stimuli, or such matching was not attempted. Thus, results differing for speech versus nonspeech sounds may reflect acoustic variation rather than sensitivity to speech sounds per se. It is also worth noting that in any single study the stimuli have typically been sounds with stable frequencies (i.e. vowel-type sounds) (Eulitz et al., 1995; Tiitinen et al., 1999; Vihla and Salmelin, 2003) or transition sounds (i.e. CV-syllable-type sounds) (Shtyrov et al., 2000), but not both. As natural language is a mixture of these sound types, it may be important to allow acoustic variation among the speech stimuli when evaluating cortical analysis of speech versus nonspeech sounds.

Characterization of the time windows and hemispheric balance of acoustic and phonetic/phonological analysis is essential not only for understanding normal speech perception but also for understanding the neural basis of dyslexia. Dyslexic individuals are known to have problems in tasks requiring auditory phonetic analysis (Bradley and Bryant, 1983; Shankweiler et al., 1995). At the neuronal level, dyslexic subjects show delayed semantic processing at 300–400 ms post-stimulus (Helenius et al., 2002b), and abnormalities in the preceding MMN response (Baldeweg et al., 1999; Schulte-Körne et al., 2000) and N100m response (Helenius et al., 2002b). These findings clearly point to problems within the first 200 ms after speech onset. It would be tempting to interpret the unusual cortical activation patterns in the dyslexic subjects as signatures of their known phonological problems but, obviously, they could equally well be associated with abnormalities in basic acoustic processing. The functional role of the N100m time window in speech versus nonspeech analysis is thus a pressing issue in dyslexia research, as well.

In the present study, we used whole-head MEG to focus on the role of the N100m auditory cortical response in acoustic and phonetic processing. First, we investigated whether the N100m response is sensitive to speech in a normal subject population, i.e. whether the strength or timing of the neural response differ between speech and nonspeech sounds. Our speech stimuli were two synthetic vowels and consonant–vowel syllables. The nonspeech stimuli were complex sounds and simple sine wave tones that were spectrally and temporally carefully matched with the speech stimuli. Second, we tested these same speech and nonspeech stimuli on a group of dyslexic individuals to investigate whether they show deviation from the response pattern seen in controls either for all sound types or specifically for speech sounds.

Materials and Methods

Stimuli

The stimuli were synthetic speech sounds, complex nonspeech sounds and simple sine wave tones (Fig. 1). The duration of all stimuli was 150 ms. The speech sounds were Finnish vowels (V; /a/, /u/) and consonant–vowel syllables (CV; /pa/, /ka/) created using a Klatt synthesizer (Klatt, 1980) for Macintosh (Sensimetrics, Cambridge, MA, USA). The fundamental frequency (F0) decreased steadily from 118 to 90 Hz, resembling a normal male voice. The formant frequencies F1, F2 and F3 for vowel /a/ were 700, 1130 and 2500 Hz, and for vowel /u/, 340, 600 and 2500 Hz, respectively. These values were based on studies of Finnish speech sounds and formant structure (Wiik, 1965; Iivonen and Laukkanen, 1993) and subjective evaluation of vowel and consonant quality and intelligibility. The formant bandwidths in both vowels were 90 Hz for F1, 100 Hz for F2 and 60 Hz for F3. The vowel envelopes had 15 ms fade-in and fade-out periods.

Figure 1.

Schematic illustration of the frequency composition in the different stimulus types (speech, complex sound and sine wave tone) for a steady-state sound (/a/ and its nonspeech equivalents) and a transition sound (/pa/ and its nonspeech equivalents). The horizontal lines represent the different frequency components (or formants, F) and the vertical dashed lines represent the end of the transition period in transition sounds (at 35 ms).

The CV-syllables started with a 35 ms frequency transition where F1, F2 and F3 frequencies linearly changed from 503 to 700 Hz, 858 to 1130 Hz and 2029 to 2500 Hz for /pa/, and from 503 to 700 Hz, 1402 to 1130 Hz and 2029 to 2500 Hz for /ka/. The initial transition was followed by a 115 ms steady-state period where the formant frequencies were identical to vowel /a/. The /pa/ and /ka/ sounds thus differed only by the direction of change in F2. To obtain a natural sounding stop consonant, the stimuli began with a 4 ms burst of frication. Aspiration was added from 1 ms onwards, decreasing smoothly during the 150 ms duration of the stimuli. The envelopes of the CV stimuli were similar to those of the vowels except for the beginning where the voicing started at 5 ms and the fade-in period was more rapid.

The nonspeech stimuli were created in Sound Edit (MacroMedia, San Francisco, CA, USA). They were simple sine wave tones and complex sounds combined from three sine wave tone components of exactly the same frequency as the formants of each of the four speech sounds. To retain the transition difference between /pa/ and /ka/ also in the sine wave tones, these stimuli were composed of the F2 frequency of each speech sound. The envelopes of the nonspeech sounds were similar to those of the speech sounds, including 15 ms fade-in and fade-out periods and a sloped fade-in for the nonspeech equivalents of the CV stimuli. Although acoustically carefully matched, none of the nonspeech sounds were perceived as speech sounds.
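
As an illustration of how such nonspeech equivalents can be constructed, the following sketch generates the complex sound and sine wave tone equivalents of /pa/ (a minimal numpy example, not the actual SoundEdit procedure; the sampling rate and linear fade shape are assumptions, while the frequency and timing values are those given above).

```python
import numpy as np

FS = 44100    # sampling rate (Hz); an assumption, not stated in the text
DUR = 0.150   # stimulus duration (s)
FADE = 0.015  # fade-in/out duration (s)

def component(f_start, f_end, trans=0.035):
    """One sine component whose frequency ramps linearly from f_start to f_end
    over the 35 ms transition period and then stays constant."""
    t = np.arange(int(DUR * FS)) / FS
    freq = np.where(t < trans, f_start + (f_end - f_start) * t / trans, f_end)
    phase = 2 * np.pi * np.cumsum(freq) / FS   # integrate frequency to get phase
    return np.sin(phase)

def apply_fades(x, fade=FADE):
    """Linear 15 ms fade-in and fade-out (the exact envelope shape is an assumption)."""
    n = int(fade * FS)
    env = np.ones_like(x)
    env[:n] = np.linspace(0.0, 1.0, n)
    env[-n:] = np.linspace(1.0, 0.0, n)
    return x * env

# Complex nonspeech equivalent of /pa/: the F1+F2+F3 transitions given in the text
pa_complex = apply_fades(sum(component(f0, f1) for f0, f1 in
                             [(503, 700), (858, 1130), (2029, 2500)]) / 3)

# Sine wave tone equivalent of /pa/: the F2 component only
pa_tone = apply_fades(component(858, 1130))
```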

The amplitudes of the different sounds were adjusted, using elongated versions of the original sounds, so that at the end of the sound delivery system, measured with an artificial ear and a spectrum analyzer calibrated to ear sensitivity, the sound amplitudes differed by <2 dB (SPL).
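
The <2 dB criterion amounts to matching the relative RMS levels of the stimuli; the sketch below only illustrates that arithmetic (the actual calibration was done acoustically at the end of the sound delivery system).

```python
import numpy as np

def rms_db(x):
    """RMS level in dB (relative; absolute SPL calibration was done acoustically)."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

def match_levels(stimuli):
    """Scale each stimulus so its RMS level equals the mean level of the set,
    keeping the relative differences well within the 2 dB criterion."""
    levels = {name: rms_db(x) for name, x in stimuli.items()}
    target = np.mean(list(levels.values()))
    return {name: x * 10 ** ((target - levels[name]) / 20)
            for name, x in stimuli.items()}

# Illustrative use with two arbitrary signals of different amplitude
t = np.arange(6615) / 44100
stimuli = {"speech": 0.8 * np.sin(2 * np.pi * 700 * t),
           "tone": 0.3 * np.sin(2 * np.pi * 1130 * t)}
print({name: round(rms_db(x), 2) for name, x in match_levels(stimuli).items()})
```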

Subjects

Subjects were 10 normally reading adults (23–39 years; five females) and 10 adults with developmental dyslexia (20–39 years; five females). The subjects gave their informed consent to participate in the study. They were native Finnish speakers, right-handed (except for one control subject), and had no history of hearing loss or neurological abnormalities. The dyslexic adults were selected on the basis of self-reported early history of reading problems. They had all been tested for dyslexia or had received special tutoring for reading difficulties during their school years. The average education level of the control (14 years) and dyslexic groups (13 years) was similar.

Behavioral Tests

The dyslexic subjects were tested for general linguistic and non-linguistic abilities using a subset of the standardized Finnish versions of the Wechsler Adult Intelligence Scale - Revised (WAIS-R) and the Wechsler Memory Scale - Revised (WMS-R) (Vocabulary, Comprehension, Similarities, Block Design, Digit Span, Visual Span) (Wechsler, 1981, 1987; Woods et al., 1998a; Woods et al., 1998b). The reading and naming speeds of the dyslexic subjects were measured as well; reduced reading speed (Leinonen et al., 2001) and naming speed (Wolf and Obregon, 1992) have been found to be reliable markers for dyslexia. In the Oral Reading test, subjects were asked to read aloud a narrative printed on a sheet of paper, and reading speed was measured as words per minute. In the Rapid Automatized Naming test (Denckla and Rudel, 1976) and the Rapid Alternating Stimulus naming test (Wolf, 1986), subjects were asked to name a 5 × 10 matrix of colors, numbers and letters, and naming speed was measured. The results of these tests were compared against normative data from 38 (Oral Reading, RAS) and 15 (RAN) normally reading subjects.

In addition, the following auditorily presented phonological tests were administered. In the Phoneme Deletion test (Leinonen et al., 2001) 16 words with 4–10 letters and with 2–4 syllables were presented via headphones. Subjects were asked to pronounce each stimulus without the second phoneme (e.g. studio → sudio, kaupunki → kupunki). The number of correct responses was calculated. In the Syllable Reversal test (Leinonen et al., 2001) 10 words and 10 pseudowords with 5–9 letters and with 3–4 syllables were presented via headphones and subjects were asked to change the order of the last two syllables and to say the new pseudoword aloud (e.g. aurinko → aukorin, rospiemi → rosmipie). The number of correct responses was calculated. For Phoneme Deletion and Syllable Reversal tests the vocal reaction times to the stimuli were measured from a microphone signal. In the Spelling test (Leinonen et al., 2001) the subjects were asked to spell to dictation 10 pseudowords and 10 words with 6–14 letters and with 2–7 syllables. The number of errors was calculated. These phonological tests were administered also to seven of the control subjects participating in this study.

MEG Measurement Procedure

Measurements were conducted in a magnetically shielded room. Stimulus presentation was controlled by the Presentation program (Neurobehavioral Systems Inc., San Francisco, CA) running on a PC. To normalize the stimulus intensities across subjects, individual hearing thresholds were determined before the actual measurement using simple 1 kHz tones of 50 ms with 15 ms rise and fall times. The stimuli were delivered to the subject through plastic tubes and earpieces at 65 dB above the subjective hearing threshold. The subjects were watching a silent film and were instructed to ignore the auditory stimuli.

There were two sessions. In the first session the subject heard a randomized sequence of vowel sounds and their nonspeech equivalents (synthetic /a/ and /u/, complex sound equivalents of /a/ and /u/, and tone equivalents of /a/ and /u/). In the second session, the stimuli were CV sounds and their nonspeech equivalents (synthetic /pa/ and /ka/, complex sound equivalents of /pa/ and /ka/, and tone equivalents of /pa/ and /ka/). The order of the sessions was randomized across subjects. Stimuli were separated by an interstimulus interval of 2 s and they were presented monaurally to the right ear to maximally engage the language-dominant left hemisphere. Each session lasted for 20–30 min and the sessions were separated by a 2–3 min break.

MEG Recordings

MEG signals were recorded using a helmet-shaped 306-channel whole-head system (Vectorview™, Neuromag Ltd, Helsinki, Finland) with two orthogonally oriented planar gradiometers and one magnetometer in 102 locations. Signals were bandpass filtered at 0.03–200 Hz, sampled at 600 Hz, and averaged on-line from 200 ms before stimulus onset to 800 ms after it. The horizontal and vertical electro-oculograms were recorded for on-line rejection of epochs contaminated by blinks or saccades. About 100 artifact-free epochs were gathered and averaged separately for each of the 12 stimulus categories. The position of the subject's head with respect to the measurement helmet was determined at the beginning of each measurement session by briefly energizing four head position indicator coils attached to the subject's head. The location of the coils was determined with respect to three anatomical landmarks (preauricular points and nasion) using a 3-D digitizer (Polhemus, Colchester, VT). The location of the active brain areas could thus be displayed on anatomical MR images after identification of the landmarks in the MR images.
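
A schematic of the on-line epoch rejection and averaging described above (a simplified numpy sketch, not the acquisition software; the EOG rejection threshold of 150 µV is an assumption, not stated in the text):

```python
import numpy as np

FS = 600                   # sampling rate (Hz)
PRE, POST = 0.200, 0.800   # epoch limits relative to stimulus onset (s)

def average_epochs(meg, eog, onsets, reject_uv=150.0):
    """Average stimulus-locked epochs, skipping those with large EOG deflections.
    meg: (n_channels, n_samples) array, eog: (n_samples,) array,
    onsets: stimulus onsets as sample indices."""
    n_pre, n_post = int(PRE * FS), int(POST * FS)
    kept = []
    for s in onsets:
        if np.ptp(eog[s - n_pre:s + n_post]) < reject_uv:
            seg = meg[:, s - n_pre:s + n_post]
            # subtract the 200 ms prestimulus baseline from each channel
            kept.append(seg - seg[:, :n_pre].mean(axis=1, keepdims=True))
    return np.mean(kept, axis=0), len(kept)
```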

Data Analysis

MEG signals were low-pass filtered at 40 Hz before further analysis. The activated areas were modeled as equivalent current dipoles (ECDs), which represent the mean location, direction and strength of the current flowing in a given cortical patch (Hämäläinen et al., 1993). The ECDs were determined from standard subsets of 46 planar gradiometers (23 pairs) that covered the 100 ms auditory field pattern over each hemisphere. A spherical model was used to describe the conductivity profile of the brain. The sphere was fitted to optimally describe the curvature of the temporal areas using the individual anatomical MR images when available (eight control subjects and four dyslexic subjects); otherwise, a sphere model averaging the individual parameters of all our subjects with MRIs, calculated separately for males and females, was used.

In every subject, ECDs were first determined separately for each stimulus. The goodness-of-fit of the obtained two-dipole models (one dipole in each hemisphere) varied from 85 to 95% across subjects and different stimuli. Within each subject, the source locations varied on average by 1 cm and the orientations of current flow by 25° across the different stimuli, in both hemispheres. The close similarity of the ECDs found in the different stimulus conditions made it possible to improve the signal-to-noise ratio by forming an average of the responses to all stimuli in each subject (four stimulus categories: two vowels and two syllables; three stimulus types: tone, complex sound, speech sound; 1090–1354 trials in total). The left- and right-hemisphere ECDs modeled in this averaged data set were then used to account for the MEG signals recorded for each stimulus. The locations and orientations of the two ECDs were kept fixed, while their amplitudes were allowed to vary to best explain the signals recorded by all sensors over the entire averaging interval. This common two-dipole model accounted for the MEG signals in each stimulus condition as well as the two-dipole models that had been determined separately for each stimulus condition (goodness-of-fit varied from 83 to 94%). The use of the common set of two ECDs for all conditions in each individual subject made it possible to directly compare the time behavior of activation in these cortical areas (source waveforms) across all stimuli.
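
With the dipole locations and orientations fixed, estimating the source waveforms reduces to a linear least-squares fit of the two dipole amplitudes to the measured field at each time point. The following sketch illustrates this step and the goodness-of-fit measure (assuming the unit-amplitude forward fields of the two ECDs are available as a lead-field matrix; the actual analysis used dedicated source modeling software).

```python
import numpy as np

def source_waveforms(L, B):
    """L: (n_sensors, 2) unit-amplitude forward fields of the two fixed ECDs;
    B: (n_sensors, n_times) measured signals. Returns the (2, n_times) dipole
    amplitudes that best explain the data in the least-squares sense."""
    Q, *_ = np.linalg.lstsq(L, B, rcond=None)
    return Q

def goodness_of_fit(L, B, Q):
    """Percentage of measured field variance explained by the two-dipole model."""
    resid = B - L @ Q
    return 100.0 * (1.0 - np.sum(resid ** 2) / np.sum(B ** 2))
```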

Statistical Tests

A repeated-measures analysis of variance (ANOVA) with stimulus category (/a/, /u/, /pa/, /ka/), stimulus type (speech sound, complex nonspeech sound, simple tone) and hemisphere (left, right) as within-subjects factors was used to evaluate systematic effects in activation strengths and latencies within each subject population. Source locations were tested in the same way, separately for each spatial dimension (x = axial plane from left ear to right ear; y = axial plane, orthogonal to x, towards the nasion; z = sagittal plane from inferior to superior), as were the orientations of current flow. For group comparisons, a mixed-model ANOVA was employed with group (controls, dyslexics) as the between-subjects factor.
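
The within-subjects part of this design could be run, for example, as follows (an illustrative sketch with random data; the statistics software actually used is not stated, and statsmodels' AnovaRM handles within-subject factors only, so the mixed-model group comparison would require a different tool).

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Illustrative random data in the layout the repeated-measures ANOVA expects:
# one N100m amplitude per subject x category x stimulus type x hemisphere.
rng = np.random.default_rng(0)
rows = [dict(subject=s, category=c, stim_type=t, hemisphere=h,
             amplitude=rng.normal(50, 10))
        for s in range(10)
        for c in ["a", "u", "pa", "ka"]
        for t in ["speech", "complex", "tone"]
        for h in ["left", "right"]]
df = pd.DataFrame(rows)

res = AnovaRM(df, depvar="amplitude", subject="subject",
              within=["category", "stim_type", "hemisphere"]).fit()
print(res)
```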

For behavioral tests, the reaction times and error scores between subject groups were analyzed using Student's t-test. To test for correlations between phonological abilities and cortical measures we calculated Pearson's correlation coefficient.

Results

Neuroimaging Results in Normally Reading Subjects

Figure 2 illustrates examples of MEG signals recorded in one subject. Responses to different sound types (speech sound, complex nonspeech sound and simple tone) are presented on the MEG sensors that showed the maximum amplitude over the left and right auditory cortex. Figure 3 shows the group mean location of the equivalent current dipoles that best represented the activated cortical areas in each subject, superposed on an MR image averaged across the control subjects (Schormann et al., 1996; Woods et al., 1998a, 1998b). In a few cases, the dipoles were found in Heschl's gyrus, but mostly they were localized to Heschl's sulcus or posterolateral to it.

Figure 2.

MEG responses evoked by speech sound /a/, complex sound equivalent of /a/ and simple tone equivalent of /a/ (black, gray and dashed line, respectively), recorded by two selected sensors over the left and right temporal cortex in one subject.

Figure 3.

The mean N100m response location in the left and right hemisphere, and the mean time behavior of activation for speech sound /pa/ (black line), its complex nonspeech equivalent (gray line) and simple tone equivalent (dashed line). The Sylvian fissure is highlighted in the MR images.

The mean time courses of activation (Fig. 3) were qualitatively similar for all stimulus categories (/a/, /u/, /pa/, /ka/) and all stimulus types (speech sound, complex sound, simple tone). After a small negative dip the signal started to increase at ∼50 ms after stimulus onset, reached the maximum at ∼100 ms (N100m), and remained at a fairly low level after ∼200 ms. The right-hemisphere sources were located on average 6 mm anterior to the left-hemisphere sources [F(1,9) = 6.1, P < 0.05], in agreement with previous reports (e.g. Elberling et al., 1982; Kaukoranta et al., 1987). There were no systematic differences in source locations between different categories (/a/, /u/, /pa/, /ka/). Small differences in locations and orientations emerged between different stimulus types (speech, complex nonspeech sounds, and simple tones) but in absolute terms they were negligibly small, 1–3 mm in mean location, and 2–7 degrees in mean orientation.

Strength of N100m Response

The strength of the N100m response (Table 1 and Fig. 4a) varied by stimulus type in the left hemisphere but not in the right hemisphere [stimulus type, F(2,18) = 10.2, P < 0.001; stimulus type-by-hemisphere interaction, F(2,18) = 13.4, P < 0.001]. In the left hemisphere, the responses were stronger to speech sounds than to complex nonspeech sounds and simple tones [F(2,18) = 14.7, P < 0.001]. The effect of stimulus type was significant for all stimulus categories (/a/: P < 0.001; /u/: P < 0.001; /pa/: P < 0.01; /ka/: P < 0.001).

Figure 4.

Mean (+ SEM) strength of the N100m activation for the control (a) and dyslexic group (b). Responses in the contralateral left hemisphere are shown on the left and those in the ipsilateral right hemisphere on the right. Speech sounds, complex nonspeech sounds and simple tones are represented by black, gray and white bars, respectively. The dashed line represents the mean amplitude of activation across all sounds in the left hemisphere of the control subjects.

Table 1

N100m source strengths and latencies in the left and right hemisphere in control (Cont) and dyslexic (Dys) subjects for speech sounds, complex non-speech sounds and simple tones (mean ± SEM)

                     Activation strength (nAm)                          Peak latency (ms)
                     Speech          Complex         Tone              Speech            Complex           Tone
                     Cont    Dys     Cont    Dys     Cont    Dys       Cont     Dys      Cont     Dys      Cont     Dys
Left hemisphere
    /a/              64 ± 7  61 ± 7  56 ± 7  58 ± 8  46 ± 5  49 ± 7    95 ± 3   106 ± 3  97 ± 3   101 ± 3  92 ± 4   100 ± 3
    /u/              68 ± 7  62 ± 7  57 ± 7  51 ± 9  53 ± 7  50 ± 7    103 ± 5  112 ± 7  96 ± 3   109 ± 5  94 ± 2   98 ± 4
    /pa/             63 ± 7  57 ± 6  49 ± 6  51 ± 7  45 ± 5  46 ± 7    103 ± 2  109 ± 6  96 ± 3   106 ± 5  92 ± 5   98 ± 8
    /ka/             60 ± 8  61 ± 6  48 ± 6  48 ± 6  40 ± 5  43 ± 7    102 ± 3  109 ± 4  97 ± 2   104 ± 3  91 ± 2   98 ± 3
Right hemisphere
    /a/              59 ± 8  43 ± 4  63 ± 8  47 ± 5  54 ± 7  39 ± 6    108 ± 3  107 ± 4  108 ± 3  105 ± 4  103 ± 4  102 ± 6
    /u/              57 ± 8  40 ± 4  60 ± 8  41 ± 5  55 ± 7  36 ± 3    113 ± 5  111 ± 7  109 ± 2  112 ± 3  103 ± 2  103 ± 3
    /pa/             53 ± 7  39 ± 4  52 ± 7  41 ± 5  52 ± 7  37 ± 4    113 ± 2  115 ± 4  108 ± 2  110 ± 3  104 ± 5  103 ± 4
    /ka/             55 ± 7  41 ± 4  52 ± 8  38 ± 4  48 ± 7  35 ± 5    109 ± 3  112 ± 7  109 ± 2  110 ± 4  106 ± 2  103 ± 4

When the stimuli were speech sounds, the strength of the N100m response was similar for all stimulus categories (/a/, /u/, /pa/, /ka/). However, for complex and simple nonspeech sounds there was a significant variation in the N100m strength by stimulus category in both hemispheres [stimulus category, F(3,27) = 4.3, P < 0.05; stimulus type-by-category interaction, F(6,54) = 2.4, P < 0.05; speech sounds alone, F(3,27) = 1.8, P = 0.2; complex sounds alone, F(3,27) = 5.0, P < 0.01; sine wave tones alone, F(3,27) = 4.4, P < 0.05].

Timing of N100m Response

The onset latency (the time point at which the signal exceeds the standard deviation of the prestimulus baseline) did not show systematic variation with sound type. However, the build-up of the N100m response in the left and right hemisphere differentiated between speech and nonspeech sounds (Fig. 5). For speech sounds, the ascending slope of the N100m response (increase of amplitude versus time) was steeper in the left than in the right hemisphere but, for the nonspeech sounds, there was no significant difference between the two hemispheres [stimulus type-by-hemisphere interaction, F(2,18) = 4.2, P < 0.05; hemisphere effect for speech sounds, F(1,9) = 7.8, P < 0.05; complex sounds, F(1,9) = 2.8, P = 0.1; sine wave tones, F(1,9) = 2.5, P = 0.2].
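
A minimal sketch of how the onset latency and ascending slope can be computed from a source waveform, under the definitions given above (details such as using the absolute value of the waveform are assumptions):

```python
import numpy as np

def onset_and_slope(wave, times):
    """Onset latency and ascending slope of the N100m from one source waveform.
    wave: source strength (nAm); times: time axis in s (prestimulus period < 0)."""
    baseline = wave[times < 0]
    thresh = baseline.std()
    post_t = times[times >= 0]
    post_w = np.abs(wave[times >= 0])
    onset_i = int(np.argmax(post_w > thresh))   # first threshold crossing
    peak_i = int(np.argmax(post_w))             # N100m peak
    slope = (post_w[peak_i] - post_w[onset_i]) / (post_t[peak_i] - post_t[onset_i])
    return post_t[onset_i], slope
```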

Figure 5.

Schematic illustration of the build-up of the N100m response (from onset to peak) to speech sound (top), complex nonspeech sound (middle) and simple tone (bottom) in the left (thick lines) and right (thin lines) hemispheres of control (left) and dyslexic subjects (right). The data are shown for the sound category /a/.

The N100m response reached its maximum on average 2–5 ms later for speech than complex nonspeech sounds and 7–9 ms later than for tones [F(2,18) = 4.8, P < 0.05], similarly in both hemispheres. The responses to all sounds reached the maximum earlier in the contralateral left hemisphere (96 ± 11 ms, mean ± SEM) than in the ipsilateral right hemisphere (108 ± 9 ms) [F(1,9) = 52.0, P < 0.001], in agreement with previous reports on monaural auditory stimulation (e.g. Elberling et al., 1982; Mäkelä et al., 1993; see Table 1).

The effect of stimulated ear was subsequently tested in 7 of the 10 subjects who participated in the original study. Stimuli presented to the left ear (/a/ and /pa/ and their nonspeech equivalents) evoked a similar activation pattern as stimuli presented to the right ear (Fig. 6). In the left hemisphere, activation was stronger to speech than to complex and simple nonspeech sounds, but in the right hemisphere no general effect of stimulus type was detected [effect of stimulus type, F(2,12) = 9.3, P < 0.01; stimulus type-by-hemisphere interaction, F(2,12) = 5.0, P < 0.05]. Thus, the sensitivity of the N100m strength in the different hemispheres for speech versus nonspeech sounds was not affected by changing the stimulated ear.

Figure 6.

Mean (+ SEM) strength of the N100m activation in the left and right hemispheres for left-ear stimulation with speech sounds, complex nonspeech sounds and simple tones (black, gray and white bars, respectively).

Neuroimaging Results in Dyslexic versus Control Subjects

There were no systematic group differences in the location of the activated areas. As in controls, the source location was slightly affected by stimulus type (1–3 mm between speech and nonspeech conditions).

Comparison of N100m Strength in the Two Subject Groups

The N100m source strength showed no main effect of subject group and no significant interactions involving group. Thus, as in controls, the N100m strength in the dyslexic subjects differentiated between speech and nonspeech sounds in the left hemisphere [F(2,18) = 8.2, P < 0.01] but not in the right hemisphere [F(2,18) = 1.5, P = 0.2] (Fig. 4b). However, in the right hemisphere there was a tendency towards generally weaker activation in the dyslexic than in the control subjects [effect of group in the right hemisphere, F(1,18) = 3.6, P = 0.08]. In a separate ANOVA for the dyslexic subjects, the N100m strength differed significantly between the hemispheres [left 53 ± 7 nAm, right 40 ± 4 nAm; F(1,9) = 5.5, P < 0.05], whereas in the control subjects the overall level of activation in the two hemispheres was very similar [left 54 ± 7 nAm, right 55 ± 7 nAm; F(1,9) = 0.01, P = 0.9].

Comparison of the N100m Timing in the Two Subject Groups

The build-up of the N100m response showed a subtle effect of subject group for speech sounds but not for nonspeech sounds [effect of group for speech sounds, F(1,18) = 4.9, P < 0.05; complex nonspeech sounds, F(1,18) = 0.9, P = 0.3; sine wave tones, F(1,18) = 1.9, P = 0.2]. The N100m for speech sounds was found to rise more gradually in dyslexic than control subjects, similarly in both hemispheres.

The peak latency of the N100m response (Table 1) showed a significant group-by-hemisphere interaction [F(1,18) = 5.4, P < 0.05]. In a separate analysis for each hemisphere the peak latency in the left hemisphere tended to be longer in dyslexic than control subjects, but this difference only approached significance [F(1,18) = 3.0, P = 0.1]. In the right hemisphere, the groups showed very similar timing of activation [F(1,18) = 0.007, P = 0.9]. When the dyslexic subjects were tested separately, the typical pattern of an earlier response in the contralateral left than ipsilateral right hemisphere found in controls was not evident (left 104 ± 5 ms, right 108 ± 5 ms) (see Fig. 5). Nevertheless, the response to simple tones reached the maximum first and the response to speech sounds last, similarly in both hemispheres, as in the control group [main effect of stimulus type F(2,18) = 7.1, P < 0.01].

Behavioral Results and Correlations to MEG Responses in Dyslexic versus Control Subjects

All the dyslexic subjects had normal intelligence, as measured by the general linguistic and non-linguistic cognitive tests (WAIS-R, WMS-R) (Table 2). The dyslexic subjects were significantly slower than normally reading controls in the Oral Reading test [mean difference 59 words/min, t(46) = 5.8, P < 0.001] and in the Rapid Naming tests [mean difference in RAS 9 s, t(46) = −5.0, P < 0.001; in RAN 5 s, t(23) = −2.5, P < 0.05]. The control subjects in the present study (7/10 tested) did not differ from the larger normative data set in either Oral Reading [t(35) = −0.7, P = 0.5] or Rapid Naming [RAS, t(35) = 1.1, P = 0.3; RAN, t(12) = 1.3, P = 0.2]. In the more specific phonological tests the dyslexic subjects were significantly slower and more error-prone than the control subjects. The reaction times of the dyslexic individuals were longer than those of the control subjects in the auditorily presented Phoneme Deletion test [difference on average 3.7 s, t(15) = −4.6, P < 0.001] and Syllable Reversal test [difference on average 5.4 s, t(15) = −4.8, P < 0.001]. Dyslexic subjects also made significantly more errors in the Phoneme Deletion [t(15) = 2.5, P < 0.05], Syllable Reversal [t(15) = 2.3, P < 0.05] and Spelling tests [t(15) = −2.9, P < 0.05] than did the control subjects.

Table 2

Results of behavioral tests


 
                                             Dyslexics (10)    Normative data    Significance level
Verbal and nonverbal intelligence (a)
    WAIS-R Similarities                      87–116            85–115
    WAIS-R Comprehension                     104–140           85–115
    WAIS-R Vocabulary                        86–122            85–115
    WAIS-R Digit Span                        82–122            85–115
    WAIS-R Block Design                      91–150            85–115
    WMS-R Visual Span                        15–21             15–21
Reading tests (b)
    Oral Reading (words/minute)              105 ± 31          164 ± 28          0.001
    RAS (total time, s)                      33 ± 6            24 ± 5            0.001
    RAN (total time, s)                      29 ± 4            24 ± 5            0.05
Phonological tests (c)
    Phoneme Deletion (reaction time, s)      6.0 ± 1.9         2.3 ± 1.2         0.001
    Phoneme Deletion (score, max 16)         12 ± 4            16 ± 1            0.05
    Syllable Reversal (reaction time, s)     9.5 ± 2.2         4.2 ± 2.4         0.001
    Syllable Reversal (score, max 20)        15 ± 2            18 ± 3            0.05
    Spelling (number of errors)              5 ± 3             1 ± 1             0.05
(a) The results (range of scores) of the dyslexic subjects are compared against published normative data (Finnish standardization, Psykologien Kustannus Oy, 1992 WAIS-R and 1996 WMS-R).

(b) The results (mean ± SD) of the dyslexic subjects are compared against normative data collected in our laboratory (n = 38 for Oral Reading and RAS, n = 15 for RAN).

(c) The results (mean ± SD) of the dyslexic subjects are compared against the results of the control subjects in this study (7/10).

To test for correlation between brain responses and behavioral measures, the scores for each test were standardized to z-scores (i.e. individual score minus the mean score over all subjects, divided by the standard deviation). We found no significant correlation between phonological abilities and the N100m strength or peak latency. We also tested the phonological scores against the difference of the N100m peak latencies between the hemispheres (Fig. 7a) and the ratio of the N100m activation strengths (Fig. 7b), as the MEG results suggested these to be more meaningful cortical measures. In the control subjects, better phonological skills were associated with a shorter ipsi-contra delay in the N100m response latency (r = −0.8, P < 0.05). In the dyslexic subjects, there was no significant correlation (r = −0.5, P = 0.1). No significant correlations were found between phonological scores and the left versus right N100m strength ratio.
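
The correlation analysis can be sketched as follows (illustrative random data; scipy is assumed, and the composite phonological measure is simply the mean of the z-scored tests, as described above).

```python
import numpy as np
from scipy.stats import pearsonr, zscore

# Illustrative random data: 10 subjects, 6 phonological test scores,
# N100m peak latencies (s) and strengths (nAm) per hemisphere.
# (Sign conventions, e.g. inverting reaction times and error counts so that
# larger always means better, are glossed over here.)
rng = np.random.default_rng(1)
scores = rng.normal(size=(10, 6))
lat_left, lat_right = rng.normal(0.100, 0.005, 10), rng.normal(0.108, 0.005, 10)
amp_left, amp_right = rng.normal(55, 7, 10), rng.normal(50, 7, 10)

phono = zscore(scores, axis=0).mean(axis=1)   # composite phonological z-score
lat_diff = lat_right - lat_left               # ipsilateral - contralateral latency
amp_ratio = amp_left / amp_right              # left / right strength ratio

r, p = pearsonr(phono, lat_diff)
print(f"phonology vs. latency difference: r = {r:.2f}, p = {p:.3f}")
r, p = pearsonr(phono, amp_ratio)
print(f"phonology vs. strength ratio:     r = {r:.2f}, p = {p:.3f}")
```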

Figure 7.

N100m latency difference (a) and ratio of activation strengths (b) between hemispheres in control and dyslexic subjects (white and black spheres, respectively), plotted against the behavioral performance (normalized average over six phonological tests). Regression lines are shown for the significant correlations.

Discussion

N100m Reflects Speech-sensitive Analysis in Normally Reading Subjects

The N100m response was fastest to simple tones. The peak latency was systematically longer for complex sounds and, further, for speech sounds, similarly in both hemispheres. However, the strength of the N100m activation displayed interesting hemispheric specialization. The responses were stronger for speech than nonspeech sounds in the left auditory cortex but not in the right auditory cortex, independent of the stimulated ear. Thus, while both hemispheres were involved in the analysis of all sound types, the relative contribution of the left auditory cortex was increased when the stimuli were speech sounds.

The present findings agree with and extend earlier reports on speech/nonspeech processing and N100m, which have shown stronger amplitude for vowels than piano notes or tones (Gootjes et al., 1999), longer latencies for vowels than tones (Eulitz et al., 1995; Tiitinen et al., 1999) or leftward shift of hemispheric balance for natural vowels as compared with complex tones (Vihla and Salmelin, 2003). Using acoustically carefully matched speech and nonspeech sounds, we demonstrate that these effects are likely to be tied together. The increase of amplitude in speech sound analysis is lateralized to the left hemisphere, resulting in a leftward shift of activation when hearing speech sounds. The increase in latency for speech sounds occurs bilaterally. We also show that the leftward shift of activation is not markedly affected by the acoustic structure of the speech stimuli (vowels, CV syllables).

One may picture the build-up of the N100m response as a signature of a process where an ever-larger number of auditory cortical neurons are firing synchronously. For a constant rate of neuronal recruitment, a delay in the peak latency would be associated with stronger peak activation. The combined increase of peak latency and N100m strength for speech versus complex versus simple nonspeech sounds in the left hemisphere could certainly be interpreted this way. On the other hand, the right-hemisphere effect of increasing peak latency with no accompanying changes in activation strength suggests a slower rate of neuronal recruitment or less synchronous firing of neuronal populations for increasing sound complexity.

Interestingly, the ascending slope of the N100m response was significantly steeper in the left than right hemisphere for speech sounds but more similar in the two hemispheres for the nonspeech sounds. This observation speaks for a qualitative difference between the analysis of speech and nonspeech sounds in the left auditory cortex by 100 ms. It thus appears that, on top of acoustic processing per se which may be affected by varying the spectral composition or temporal structure of the sounds, the N100m response may also reflect speech-specific processing.

At the cellular level, speech-specificity could mean that the neurons generating the response prefer sounds that form phonetically (linguistically) relevant combinations of acoustic features. Acoustically, speech sounds do not have any single unique property that distinguishes them from nonspeech sounds but rather represent particular (unique) combinations of different properties (Stevens, 1980). Although there is plenty of information available on how phonetically important features are encoded in the cochlear nucleus and auditory nerve (see e.g. Delgutte, 1999), the combinations of features in speech sounds that are critical for analysis at the cortical level are less well defined. The present study implies that a simple combination of formant frequencies does not suffice, as the N100m response to the complex nonspeech sounds differed from that evoked by the speech sounds.

‘Combination sensitive’ neurons, originally proposed by Suga et al. (1978) in a study of the auditory system of echolocating bats, have been investigated in a number of animal species and recently also in nonhuman primates (Rauschecker et al., 1995). In the macaque, neurons located posterior to the primary auditory cortex of the left hemisphere (roughly corresponding to the location of our N100m source areas) responded better to complex sounds, e.g. species-specific calls, than to simple tones (Rauschecker et al., 1995). This kind of preference is suggested to be the result of nonlinear summation of inputs from more narrowly tuned neurons in the primary auditory cortex (Rauschecker et al., 1995; Rauschecker, 1998).

Some degree of correspondence between nonhuman primates and humans is suggested by the observation that increased stimulus complexity (band-passed noise versus pure tones) results in similarly enhanced activation in humans, in corresponding areas posterior to the primary auditory cortex (Wessinger et al., 2001). However, as the phonetics of human speech are not directly comparable to animal communication sounds, and it is not known whether the analysis of speech sounds uses the same computations as that of other complex sounds, these observations cannot be unequivocally linked to human speech perception.

In recent years, much has been learned about the functional anatomy of auditory processing of complex sounds in humans, but detailed information about the underlying neural processes remains largely unestablished. At the anatomical level, it is known that the primary auditory cortex, located in Heschl's gyrus, is surrounded by non-primary auditory areas anteriorly, laterally and posteriorly (for a review, see Hall et al., 2003). With time-sensitive imaging methods it has been shown that by 100 ms the activation is largely generated in nonprimary auditory areas posterior and lateral to the primary auditory cortex, in the planum temporale (PT) (Liegeois-Chavel et al., 1994; Lütkenhöner and Steinstrater, 1998).

Some hemodynamic studies of speech and nonspeech processing have suggested a linguistically specialized role for the PT and the surrounding cortex (Zatorre et al., 1992; Benson et al., 2001; Vouloumanos et al., 2001), while other studies have seen it as part of a basic acoustic analysis network and thus relevant for the processing of both speech and nonspeech sounds (Binder et al., 1996, 2000). In agreement with the latter view, the N100 response is generated by any kind of abrupt change in the auditory environment (Hari, 1990). Here, we found a strong N100m response to both speech and nonspeech sounds, which showed a small but significant modulation by the speech content of the stimulus. Taking into account the inertia of blood-flow measures, stimulus-dependent variation of transient neural responses like the N100m may well go undetected in PET or fMRI. The different time windows accessible with the different imaging methods may have a considerable effect on which part of the network is detected. Our MEG results suggest that at 100 ms after stimulus onset, activation of the PT and the adjacent auditory cortex reflects acoustic but also speech-specific analysis.

The exact nature of the link between speech-specific properties of a sound and neuronal firing remains to be clarified. Based on her psychoacoustical experiments, Kuhl (2000) has proposed that the statistical properties of auditory input shape the auditory processing system in infancy to enhance language perception. This view would suggest that, whatever the critical feature combinations in speech may be, experience has a major role in creating the sensitivity for speech.

Implications for Acoustic versus Speech-specific Analysis in Dyslexia

The pattern of speech versus nonspeech differentiation seen in control subjects was reproduced in the dyslexic group. However, group differences emerged in the interhemispheric timing of the N100m response and in the overall balance of the N100m activation strength, similarly for speech and nonspeech sounds. In controls, the response was earlier in the left (contralateral) than in the right (ipsilateral) hemisphere, but in dyslexics the left-hemisphere response was delayed and the N100m reached its maximum at the same time in the two hemispheres. Furthermore, in the dyslexic subjects the right-hemisphere responses were weaker than the left-hemisphere responses, whereas in the control group the overall level of activation was similar across the two hemispheres.

The unusual timing and amplitude effects could reflect separate processes, but they can also be readily understood as components of a single process. As the activation in the contralateral auditory cortex is thought to modulate the ipsilateral auditory cortex via callosal connections (Mäkelä and Hari, 1992; Oe et al., 2002), a delay in the left-hemisphere N100m response could reduce the strength of the right-hemisphere N100m. This would result in the combination of timing and amplitude effects observed in our dyslexic subjects. Why is the left-hemisphere N100m response delayed in dyslexic individuals? Normally, the contra- and ipsilateral N100m responses are systematically slower in the left than in the right hemisphere for simple tones (Salmelin et al., 1999). The longer processing time in the left hemisphere may be related to stronger connections between Heschl's gyrus (primary auditory cortex) and the adjacent PT in the left than right hemisphere (Penhune et al., 1996). Any irregularities in this interaction could cause a delay in the build-up of the N100m response. Interestingly, abnormalities in the development of the left PT (or left versus right PT) and perisylvian regions have been suggested by post-mortem (e.g. Galaburda et al., 1985; for a review, see Galaburda, 1993), anatomical MRI (e.g. Hynd et al., 1990; Leonard et al., 1993) and animal studies (for a review, see Galaburda, 1994), which could affect the interaction between Heschl's gyrus and the PT and, further, the N100m response to auditory stimuli. However, it is important to note that the relationship between abnormalities of the planum temporale and dyslexia may be more complex, varying e.g. with hand preference and general verbal ability (see e.g. Rumsey et al., 1997; Eckert and Leonard, 2000).

The present data suggest changes in general auditory processing in dyslexia in the time window when speech-specific information is extracted and the (left) PT becomes involved in the process. As the stimuli were delivered to the right ear only, we must remain cautious about the hemispheric specificity of the effect. In a PET study of word repetition, McCrory et al. (2000) used binaural stimuli and found abnormally weak activation of the right auditory cortex in dyslexic adults, which would speak for hemisphere-specific effects. McCrory et al. (2000) interpreted their finding as reflecting particular emphasis on phonetic (left hemisphere) and de-emphasis on non-phonetic (right hemisphere) auditory processing in dyslexia. In the present data set, however, reduced right-hemisphere activation was detected for speech and nonspeech stimuli alike during passive listening, thus rendering a purely linguistic explanation rather unlikely.

To allow direct comparison between speech and nonspeech sounds, the stimuli were acoustically matched as well as possible, and they were as simple as possible. Therefore, it is not reasonable to directly compare the present data with previous MEG studies of speech or nonspeech processing in dyslexia which used rapidly successive nonspeech sounds (Nagarajan et al., 1999), paired speech or nonspeech sounds not matched for intensity (Helenius et al., 2002a), or natural speech sounds (Helenius et al., 2002b) on quite specific groups of dyslexics (pronounced auditory problems, strong family history of dyslexia). Nevertheless, the important common finding in all these studies is that differences in auditory processing between control and dyslexic groups were found in the N100m response.

To conclude, we provide evidence that activation arising from the PT and the surrounding auditory cortex at 100 ms after sound onset is sensitive to phonetic content in the speech signal. This claim is based on the significant increase in activation strength and rate of signal build-up in the left hemisphere for speech sounds as compared with complex and simple nonspeech sounds. In dyslexic subjects, the altered hemispheric balance in both activation strength and timing are proposed to be linked to abnormalities within the left PT or in the communication between the PT and the primary auditory cortex which affect all auditory processing, including phonetic analysis. A general auditory impairment within the time window of phonetic analysis is consistent with reports on both phonological impairment (Rumsey et al., 1992; Studdert-Kennedy and Mody, 1995; Mody et al., 1997; Helenius et al., 2002a) and on basic auditory deficit (Tallal et al., 1993; Hari and Kiesilä, 1996; Fitch et al., 1997; Ahissar et al., 2000; Amitay et al., 2002; Renvall and Hari, 2002) in dyslexia.

This study was supported by the European Union Fifth Framework Programme (grant no. QLK6-CT-1999-02140) and the Academy of Finland (grant no. 44879, Finnish Centre of Excellence Programme 2000–2005). The MRIs were obtained at the Department of Radiology, Helsinki University Central Hospital. We thank Seija Leinonen for providing the phonological test material and Mika Seppä for help in transforming the individual subject data to averaged MR images.

References

Ahissar M, Protopapas A, Reid M, Merzenich MM (
2000
) Auditory processing parallels reading abilities in adults.
Proc Natl Acad Sci USA
 
97
:
6832
–6837.
Alho K (
1995
) Cerebral generators of mismatch negativity (MMN) and its magnetic counterpart (MMNm) elicited by sound changes.
Ear Hear
 
16
:
38
–51.
Amitay S, Ahissar M, Nelken I (
2002
) Auditory processing deficits in reading disabled adults.
J Assoc Res Otolaryngol
 
3
:
302
–320.
Aulanko R, Hari R, Lounasmaa O, Näätänen R, Sams M (
1993
) Phonetic invariance in the human auditory cortex.
Neuroreport
 
4
:
1356
–1358.
Baldeweg T, Richardson A, Watkins S, Foale C, Gruzelier J (
1999
) Impaired auditory frequency discrimination in dyslexia detected with mismatch evoked potentials.
Ann Neurol
 
45
:
495
–503.
Benson R, Whalen DH, Richardson M, Swainson B, Clark VP, Lai S, Liberman AM (
2001
) Parametrically dissociating speech and nonspeech perception in the brain using fMRI.
Brain Lang
 
78
:
364
–396.
Binder JR, Rao SM, Hammeke TA, Yetkin FZ, Jesmanowicz A, Bandettini PA, Wong EC, Estkowski LD, Goldstein MD, Haughton WM, Hyde JS (
1994
) Functional magnetic resonance imaging of human auditory cortex.
Ann Neurol
 
35
:
662
–672.
Binder JR, Frost JA, Hammeke TA, Rao SM, Cox RW (
1996
) Function of the left planum temporale in auditory and linguistic processing.
Brain
 
119
:
1239
–1247.
Binder J, Frost J, Hammeke T, Bellgowan P, Springer J, Kaufman J, Possing E (
2000
) Human temporal lobe activation by speech and nonspeech sounds.
Cereb Cortex
 
10
:
512
–528.
Bradley L, Bryant P (
1983
) Categorizing sounds and learning to read — a causal connection.
Nature
 
301
:
419
–421.
Connolly JF, Phillips NA (
1994
) Event-related potential components reflect phonological and semantic processing of the terminal word of spoken sentence.
J Cogn Neurosci
 
6
:
256
–266.
Delgutte B (
1999
) Auditory neural processing of speech. In: The handbook of phonetic sciences (Hardcastle W, Laver J, eds), pp. 507–538. Oxford: Blackwell Publishers.
Demonet J, Chollet F, Ramsay S, Cardebat D, Nespoulous JL, Wise R, Rascol A, Frackowiak R (
1992
) The anatomy of phonological and semantic processing in normal subjects.
Brain
 
115
:
1753
–1768.
Denckla M, Rudel R (
1976
) Rapid ‘automatized’ naming (R.A.N): dyslexia differentiated from other learning disabilities.
Neuropsychologia
 
14
:
471
–479.
Eckert M, Leonard C (
2000
) Structural imaging in dyslexia: the planum temporale.
Ment Retard Dev Disabil Res Rev
 
6
:
198
–206.
Elberling C, Bak C, Kofoed B, Lebech J, Saermark K (
1982
) Auditory magnetic fields from the human cerebral cortex. Location and strength of an equivalent current dipole.
Acta Neurol Scand
 
65
:
553
–569.
Eulitz C, Diesch E, Pantev C, Hampson S, Elbert T (
1995
) Magnetic and electric brain activity evoked by the processing of tone and vowel stimuli.
J Neurosci
 
15
:
2748
–2755.
Fitch RH, Miller S, Tallal P (
1997
) Neurobiology of speech perception.
Annu Rev Neurosci
 
20
:
331
–353.
Galaburda AM (
1993
) Neuroanatomic basis of developmental dyslexia.
Neurol Clin
 
11
:
161
–173.
Galaburda AM (
1994
) Developmental dyslexia and animal studies: at the interface between cognition and neurology.
Cognition
 
50
:
133
–149.
Galaburda AM, Sherman GF, Rosen F, Aboitiz N, Geschwind N (
1985
) Developmental dyslexia: four consecutive patients with cortical anomalies.
Ann Neurol
 
18
:
222
–233.
Gootjes L, Raij T, Salmelin R, Hari R (
1999
) Left-hemisphere dominance for processing of vowels: a whole-scalp neuromagnetic study.
Neuroreport
 
10
:
2987
–2991.
Hall D, Hart H, Johnsrude I (
2003
) Relationships between human auditory cortical structure and function.
Audiol Neurootol
 
8
:
1
–18.
Hari R (
1990
) The neuromagnetic method in the study of the human auditory cortex. In: Auditory evoked magnetic fields and potentials: advances in audiology (Grandori F, Hoke M, Romani G, eds), pp. 222–282. Basel: S. Karger.
Hari R, Kiesilä P (
1996
) Deficit of temporal auditory processing in dyslexic adults.
Neurosci Lett
 
205
:
138
–140.
Helenius P, Salmelin R, Richardson U, Leinonen S, Lyytinen H (
2002
a) Abnormal auditory cortical activation in dyslexia 100 ms after speech onset.
J Cogn Neurosci
 
15
:
603
–617.
Helenius P, Salmelin R, Service E, Connolly JF, Leinonen S, Lyytinen H (
2002
b) Cortical activation during spoken-word segmentation in nonreading-impaired and dyslexic adults.
J Neurosci
 
22
:
2936
–2944.
Hynd GW, Semrud-Clikeman M, Lorys AR (1990) Brain morphology in developmental dyslexia and attention deficit disorder/hyperactivity. Arch Neurol 47:919–926.
Hämäläinen M, Hari R, Ilmoniemi R, Knuutila J, Lounasmaa O (1993) Magnetoencephalography — theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev Modern Phys 65:413–497.
Iivonen A, Laukkanen AM (1993) Explanations for the qualitative variation of Finnish vowels. In: Studies in logopedics and phonetics 4 (Iivonen A, Lehtihalmes M, eds), pp. 29–54. Helsinki: University of Helsinki.
Kaukoranta E, Hari R, Lounasmaa OV (1987) Responses of the human auditory cortex to vowel onset after fricative consonants. Exp Brain Res 69:19–23.
Klatt D (1980) Software for a cascade/parallel formant synthesizer. J Acoust Soc Am 67:971–995.
Kuhl P (2000) A new view of language acquisition. Proc Natl Acad Sci USA 97:11850–11857.
Kuriki S, Murase M (1989) Neuromagnetic study of the auditory responses in right and left hemispheres of the human brain evoked by pure tones and speech sounds. Exp Brain Res 77:127–134.
Leinonen S, Müller K, Leppänen PHT, Aro M, Ahonen T, Lyytinen H (2001) Heterogeneity in adult dyslexic readers: relating processing skills to the speed and accuracy of oral text reading. Read Writ 14:265–296.
Leonard CM, Voeller KKS, Lombardino LJ, Morris MK, Hynd GW, Alexander AW, Andersen HG, Garofalakis M, Honeyman JC, Mao J, Agee OF, Staab EV (1993) Anomalous cerebral structure in dyslexia revealed with MRI. Arch Neurol 50:461–469.
Liégeois-Chauvel C, Musolino A, Badier JM, Marquis P, Chauvel P (1994) Evoked potentials recorded from the auditory cortex in man: evaluation and topography of the middle latency components. Electroencephalogr Clin Neurophysiol 92:204–214.
Lütkenhöner B, Steinstrater O (1998) High-precision neuromagnetic study of the functional organization of the human auditory cortex. Audiol Neurootol 3:191–213.
Mäkelä J, Hari R (1992) Neuromagnetic auditory evoked responses after a stroke in the right temporal lobe. Neuroreport 3:94–96.
Mäkelä JP, Ahonen A, Hämäläinen M, Hari R, Ilmoniemi R, Kajola M, Knuutila J, Lounasmaa OV, McEvoy L, Salmelin R, Salonen O, Sams M, Simola J, Tesche C, Vasama J-P (1993) Functional differences between auditory cortices of the two hemispheres revealed by whole-head neuromagnetic recordings. Hum Brain Mapp 1:48–56.
McCrory E, Frith U, Brunswick N, Price C (2000) Abnormal functional activation during a simple word repetition task: a PET study of adult dyslexics. J Cogn Neurosci 12:753–762.
Mody M, Studdert-Kennedy M, Brady S (1997) Speech perception deficits in poor readers: auditory processing or phonological coding? J Exp Child Psychol 64:199–231.
Näätänen R (1992) Attention and brain function. Hillsdale, NJ: Erlbaum.
Näätänen R, Lehtokoski A, Lennes M, Cheour M, Huotilainen M, Iivonen A, Vainio M, Alku P, Ilmoniemi R, Luuk A, Allik J, Sinkkonen J, Alho K (1997) Language specific phoneme representations revealed by electric and magnetic brain responses. Nature 385:432–434.
Nagarajan S, Mahncke H, Salz T, Tallal P, Roberts T, Merzenich MM (1999) Cortical auditory signal processing in poor readers. Proc Natl Acad Sci USA 96:6483–6488.
Oe H, Kandori A, Yamada N, Miyashita T, Tsukada K, Naritomi H (2002) Interhemispheric connection of auditory neural pathways assessed by auditory evoked magnetic fields in patients with fronto-temporal lobe infarction. Neurosci Res 44:483–488.
Penhune VB, Zatorre RJ, MacDonald JD, Evans AC (1996) Interhemispheric anatomical differences in human primary auditory cortex: probabilistic mapping and volume measurement from magnetic resonance scans. Cereb Cortex 6:661–672.
Phillips C (2001) Levels of representation in the electrophysiology of speech perception. Cogn Sci 25:711–731.
Phillips C, Pellathy T, Marantz A, Yellin E, Wexler K, Poeppel D, McGinnis M, Roberts T (2000) Auditory cortex accesses phonological categories: an MEG mismatch study. J Cogn Neurosci 12:1038–1055.
Rauschecker J (1998) Cortical processing of complex sounds. Curr Opin Neurobiol 8:516–521.
Rauschecker J, Tian B, Hauser M (1995) Processing of complex sounds in the macaque nonprimary auditory cortex. Science 268:111–114.
Renvall H, Hari R (2002) Auditory cortical responses to speech-like stimuli in dyslexic adults. J Cogn Neurosci 14:757–768.
Rumsey JM, Andreason P, Zametkin AJ, Aquino T, King C, Hamburger SD, Pikus A, Rapoport JL, Cohen R (1992) Failure to activate the left temporal cortex in dyslexia: an oxygen 15 positron emission tomographic study. Arch Neurol 49:527–534.
Rumsey JM, Donohue BC, Brady DR, Nace K, Giedd GN, Andreason P (1997) A magnetic resonance imaging study of planum temporale asymmetry in men with developmental dyslexia. Arch Neurol 54:1481–1489.
Salmelin R, Schnitzler A, Parkkonen L, Biermann K, Helenius P, Kiviniemi K, Kuukka K, Schmitz F, Freund H (1999) Native language, gender, and functional organization of the auditory cortex. Proc Natl Acad Sci USA 96:10460–10465.
Schulte-Körne G, Deimel W, Bartling J, Remschmidt H (2000) Speech perception deficit in dyslexic adults as measured by mismatch negativity (MMN). Int J Psychophysiol 40:77–87.
Shankweiler D, Crain S, Katz L, Fowler A, Liberman A, Brady S, Thornton R, Lundquist E, Dreyer L, Fletcher J, Stuebing K, Shaywitz S, Shaywitz B (1995) Cognitive profiles of reading-disabled children: comparison of language skills in phonology, morphology, and syntax. Psychol Sci 6:149–156.
Shtyrov Y, Kujala T, Palva S, Ilmoniemi R, Näätänen R (2000) Discrimination of speech and of complex nonspeech sounds of different temporal structure in the left and right cerebral hemispheres. Neuroimage 12:657–663.
Stevens KN (1980) Acoustic correlates of some phonetic categories. J Acoust Soc Am 68:836–842.
Studdert-Kennedy M, Mody M (1995) Auditory temporal perception deficits in the reading-impaired: a critical review of the evidence. Psychon Bull Rev 2:508–514.
Suga N, O'Neill WE, Manabe T (1978) Cortical neurons sensitive to combinations of information-bearing elements of biosonar signals in the mustache bat. Science 200:778–781.
Tallal P, Miller S, Fitch R (1993) Neurobiological basis of speech: a case for the preeminence of temporal processing. Ann N Y Acad Sci 14:27–47.
Tiitinen H, Sivonen P, Alku P, Virtanen J, Näätänen R (1999) Electromagnetic recordings reveal latency differences in speech and tone processing in humans. Cogn Brain Res 8:355–363.
Vihla M, Salmelin R (2003) Hemispheric balance in processing attended and non-attended vowels and complex tones. Cogn Brain Res 16:167–173.
Vihla M, Lounasmaa O, Salmelin R (2000) Cortical processing of change detection: dissociation between natural vowels and two-frequency complex tones. Proc Natl Acad Sci USA 97:10590–10594.
Vouloumanos A, Kiehl K, Werker J, Liddle P (2001) Detection of sounds in the auditory stream: event-related fMRI evidence for differential activation to speech and nonspeech. J Cogn Neurosci 13:994–1005.
Wechsler D (1981) Wechsler adult intelligence scale — revised: manual. New York: Psychological Corporation. [Finnish translation, Psykologien Kustannus Oy, 1992.]
Wechsler D (1987) Wechsler memory scale — revised: manual. New York: Psychological Corporation. [Finnish translation, Psykologien Kustannus Oy, 1997.]
Wessinger C, Van Meter J, Tian B, Van Lare J, Pekar J, Rauschecker J (2001) Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging. J Cogn Neurosci 13:1–7.
Wiik K (1965) Finnish and English vowels. Turku: University of Turku.
Wolf M (1986) Rapid alternating stimulus naming in the developmental dyslexias. Brain Lang 27:360–379.
Wolf M, Obregon M (1992) Early naming deficits, developmental dyslexia, and a specific deficit hypothesis. Brain Lang 42:219–247.
Zatorre R, Evans A, Meyer E, Gjedde A (1992) Lateralization of phonetic and pitch discrimination in speech processing. Science 256:846–849.
Schormann T, Henn S, Zilles K (1996) A new approach to fast elastic alignment with applications to human brains. Lect Notes Comput Sci 1131:337–342.
Woods RP, Grafton ST, Holmes CJ, Cherry SR, Mazziotta JC (1998a) Automated image registration: I. General methods and intrasubject, intramodality validation. J Comput Assist Tomogr 22:139–152.
Woods RP, Grafton ST, Watson JDG, Sicotte NL, Mazziotta JC (1998b) Automated image registration: II. Intersubject validation of linear and nonlinear models. J Comput Assist Tomogr 22:153–165.