Spatiotemporal patterns of neuronal responses to asynchronous two-tone stimuli in the anterior field of the auditory cortex of anesthetized guinea pigs were studied using an optical recording method (12 × 12 photodiode array, voltage sensitive dye RH795). Interactions between the onset response to the first tone (masker; 5, 8, 10, 12 and 15 kHz, 200 ms) and to the second tone (probe; 10 kHz, 30 ms) with onset delays relative to the masker onset (0, 5, 10, 15 and 20 ms) were investigated. In general, two-tone interaction was suppressive rather than facilitative. At 0–10 ms probe delays, two-tone responses induced in the probe isofrequency area on the cortex tended to fuse with the masker response. At 15–20 ms probe delays, the probe response was apparently reduced, but was spatially focused and separated from the masker response. This spatial focusing of the probe response may have been due to neuronal inhibition originating after the masker onset response. These results are in agreement with psychoacoustical observations in human subjects, such as auditory segregation, and indicate that the spatial focusing of the cortical response provides a neuronal basis for detecting slightly asynchronous auditory inputs.
One of the aims of auditory physiology is to understand how components of a sound complex are identified, grouped together and segregated from other components. The onset time difference (asynchrony) of auditory events is a critical cue to segregate the components. Darwin and coworkers have examined the influence of an asynchronous constant-frequency component starting before the remaining components of a vowel on perceptual quality (Darwin, 1984; Hukin and Darwin, 1995; Darwin and Hukin, 1998). They showed that 30–40 ms asynchrony was sufficient for the listeners to remove the influence of that component. This finding suggests that one can segregate sound components even with an onset asynchrony of a few tens of milliseconds.
In auditory physiology, the effects of tone onset asynchrony on neuronal responses have been studied on the basis of two-tone masking, which refers to the reduction of neuronal response to a brief tone (probe) by the presence of another tone (masker). This reduction is assumed to be caused by at least three possible mechanisms. ‘Two-tone suppression’ has been thought to occur as a result of nonlinearities at the level of basilar membrane motion in the cochlea, because of many similarities between psychoacoustic tuning curves and frequency selectivity in the cochlea (Moore, 1986; Evans et al., 1989). Secondly, ‘short-term adaptation’ has been examined in forward (sequential) two-tone masking and is thought to originate in the auditory nerve as a result of poststimulatory reduction in response (Harris and Dallos, 1979; Abbas and Gorga, 1981; Finlayson and Adam, 1997; Duan and Canlon, 2001). Thirdly, ‘neuronal inhibition’ in the higher auditory centers has been shown to increase neuronal selectivity to tone frequency (Müller and Scheich, 1987, 1988; Spirou et al., 1999; Wang et al., 2000).
The mechanisms of auditory segregation caused by onset asynchrony have been explained by short-term adaptation in the peripheral auditory system (Ciocca and Darwin, 1993). If the neuronal response to the asynchronous (preceding) component is reduced by short-term adaptation, that component may scarcely contribute to the perceptual quality of the subsequent components. However, short-term adaptation accounts only for the case of onset asynchrony of a few hundred milliseconds, because if the onset delay of the subsequent component is shorter, its response will also be affected by short-term adaptation to the asynchronous component.
In this study, we used an optical imaging technique to analyse the interaction between two onset responses induced by two asynchronous tones in the guinea pig auditory cortex. The optical imaging technique is suitable for spatiotemporal analysis of asynchronous auditory events in the cortex. This technique is able to record neuronal activities in real time in multiple sites of the central nervous system (Salzberg et al., 1977; Orbach et al., 1985; Grinvald et al., 1986; Kamino et al., 1989; Taniguchi et al., 1992; Uno et al., 1993; Bakin et al., 1996; Horikawa et al., 1996; Laaris et al., 2000). Previous optical studies (Taniguchi et al., 1992; Uno et al., 1993) showed a dynamic frequency representation in the guinea pig auditory cortex which was identical with that obtained with a microelectrode technique (Redies et al., 1989; Arenberg et al., 2000; Wallace et al., 2000). Here we discuss the mechanisms of auditory segregation on the basis of dynamic neuronal population activity.
Materials and Methods
Seven adult guinea pigs weighing 280–420 g were used for optical recordings. The recording method was essentially identical to that described in detail elsewhere (Horikawa et al., 1996). The experiments were performed in accordance with the guidelines of the Animal Experiments Committee of Tokyo Medical and Dental University. Each animal was anesthetized with Nembutal (sodium pentobarbital, 30 mg/kg, i.p.). The heart rate, pupil reflex and electroencephalogram were monitored, and anesthesia was maintained by supplementary doses of Nembutal (10 mg/kg/h, i.p.) and neuroleptanalgesic (droperidol, 1.7 mg/kg/h and pentazocine, 0.5 mg/kg/h). The trachea was cannulated and the head was clamped by the incisors and the external auditory meatus in a metal frame using a hollow ear bar. A hole (~6 mm in diameter) was drilled in the left temporal bone and the dura and arachnoid membranes were removed. The animal was artificially respirated after inducing paralysis with pancuronium bromide (1 mg/kg/h, i.m.). A respirator supplied air to which oxygen (60%) was added. The auditory cortex was stained for 90 min with a voltage-sensitive dye, RH795 (0.2 mg/ml, dissolved in saline; Molecular Probes). The experiments were carried out in a dark sound-proof room. After the end of the experiment, each animal was killed with an overdose of Nembutal (90 mg/kg).
Figure 1 shows a schematic drawing of the optical recording system and recording locations in the cortex. The anterior (A) and partially dorsocaudal (DC) fields in the exposed auditory cortex were viewed with a measuring microscope (×5 magnification, NA 0.4, model LW-101, Hiland) which was focused 200 μm below the cortical surface (at layer II–III). The fluorescent signals emitted by the stained auditory cortex were long-passed at 620 nm and monitored with a 144 (12 × 12) channel photodiode array mounted on the microscope. The 144 signals were amplified ×2000, band-passed between 1 Hz and 2 kHz, and sampled with A/D converters (model MD-164, Narumi) at a rate of 0.576 ms per frame. Four hundred frames (230 ms) per trial were sampled and the optical responses for 10 sequential trials were averaged. Noise originating from heart pulsation was reduced by synchronizing the recording with the heart beat and by subtracting the recording without a stimulus from that with a stimulus. Noise originating from respiration was avoided by stopping the respirator for ~10 s during recording. The illumination was turned on only during the recording period to minimize dye bleaching.
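The frame timing and the averaging steps described above can be sketched as follows. This is a minimal illustration, not the authors' acquisition software: the function names are hypothetical, and traces are assumed to be plain lists of samples from one photodiode channel.

```python
# Values taken from the text: 0.576 ms per frame, 400 frames per trial.
FRAME_INTERVAL_MS = 0.576
FRAMES_PER_TRIAL = 400


def trial_duration_ms(n_frames=FRAMES_PER_TRIAL, dt_ms=FRAME_INTERVAL_MS):
    """Length of one recording sweep in milliseconds."""
    return n_frames * dt_ms


def average_trials(trials):
    """Average equally long traces sample by sample (10 trials in the text)."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def subtract_blank(stim_trace, blank_trace):
    """Remove heartbeat-locked noise by subtracting a no-stimulus recording.

    Assumes both traces were triggered at the same phase of the heart beat.
    """
    return [s - b for s, b in zip(stim_trace, blank_trace)]


if __name__ == "__main__":
    print(round(trial_duration_ms(), 1))  # about 230 ms per trial
```

With these values one sweep lasts 400 × 0.576 ms, which matches the ~230 ms quoted in the text.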
Two pure-tone bursts were used for stimulation: masker (5, 8, 10, 12 and 15 kHz, 200 ms duration) and probe (10 kHz, 30 ms duration). Each tone was ~20 dB above the threshold of optical responses, with 10 ms rise–fall time. In four animals, the interaction between two onset responses was examined using the five masker frequencies described above. The probe onset delay ranged from 0 to 20 ms in 5 ms steps relative to the masker onset. Furthermore, in three animals, the effects of longer probe delays (30, 50 and 100 ms) were examined using 5 and 15 kHz masker frequencies.
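As a rough illustration of the stimulus timing, the sketch below builds a masker–probe pair digitally. Several details are assumptions that the text does not specify: the sampling rate (`fs_hz`), the linear shape of the 10 ms rise–fall ramp, and the equal unit amplitude of both tones. The function names are hypothetical.

```python
import math


def tone_burst(freq_hz, duration_ms, ramp_ms=10.0, fs_hz=100_000):
    """Pure-tone burst with linear onset and offset ramps (assumed shape)."""
    n = round(duration_ms * fs_hz / 1000)
    n_ramp = round(ramp_ms * fs_hz / 1000)
    samples = []
    for i in range(n):
        if n_ramp > 0:
            # Envelope rises from 0 to 1, holds, then falls back to 0.
            env = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        else:
            env = 1.0
        samples.append(env * math.sin(2 * math.pi * freq_hz * i / fs_hz))
    return samples


def two_tone(masker_hz, probe_delay_ms, probe_hz=10_000, fs_hz=100_000):
    """Sum a 200 ms masker and a 30 ms probe delayed relative to masker onset."""
    masker = tone_burst(masker_hz, 200.0, fs_hz=fs_hz)
    probe = tone_burst(probe_hz, 30.0, fs_hz=fs_hz)
    offset = round(probe_delay_ms * fs_hz / 1000)
    mix = list(masker)
    for i, value in enumerate(probe):
        mix[offset + i] += value
    return mix


if __name__ == "__main__":
    for delay_ms in (0, 5, 10, 15, 20):
        pair = two_tone(5_000, delay_ms)
        print(delay_ms, len(pair))
```

Because the longest probe delay used (100 ms) plus the probe duration (30 ms) is shorter than the 200 ms masker, the probe always falls inside the masker.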
The sound signals were produced by a computer (Gateway 2000), a sound editor system with A/D and D/A converters (Signal-RTS, Engineering Design), an attenuator (TPA-114A, Tamagawa Electronics) and a custom-made power amplifier. Two-tone pairs were presented to the right ear contralateral to the recording side through the hollow ear bar and a custom-made condenser earphone. The stimuli were presented at intervals of ~1 s, synchronized with the electrocardiogram. The sound pressure levels near the tympanic membrane were measured in situ at 360 frequencies between 0.1 and 99 kHz with a probe tube Brüel and Kjaer 4134 quarter-inch microphone (Murata et al., 1986) and expressed as dB SPL (sound pressure level re. 20 μPa).
Each optically detected activity was expressed as the ratio ΔF/F, where F and ΔF were the light intensity at rest and the change in intensity induced by neuronal responses, respectively. The data plotted as a 12 × 12 matrix of the optical signal traces were smoothed spatially by dividing a channel into 2 × 2 small pixels and interpolating the signal amplitude (Taniguchi et al., 1992). Statistical analysis was applied on the basis of the data averaged in four neighboring channels displaying a stronger peak response to the probe alone, and occasionally to each masker alone, in field A in the auditory cortex. In all statistical tests, differences were considered significant if P < 0.05.
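The ΔF/F normalization and the selection of four neighboring channels can be sketched as below. Two points are assumptions rather than details given in the text: F is estimated here as the mean of a pre-stimulus baseline window, and the "four neighboring channels" are taken to be a 2 × 2 block. The function names are hypothetical.

```python
def delta_f_over_f(trace, n_baseline):
    """Express a raw intensity trace as fractional change from rest.

    F is estimated as the mean of the first n_baseline samples (assumption).
    """
    f_rest = sum(trace[:n_baseline]) / n_baseline
    return [(value - f_rest) / f_rest for value in trace]


def best_2x2_block(peak_map):
    """Return (mean, row, col) of the 2 x 2 block with the largest mean peak.

    peak_map is a 2-D list of peak responses, e.g. 12 x 12 channels.
    """
    best = None
    for r in range(len(peak_map) - 1):
        for c in range(len(peak_map[0]) - 1):
            mean = (peak_map[r][c] + peak_map[r][c + 1]
                    + peak_map[r + 1][c] + peak_map[r + 1][c + 1]) / 4
            if best is None or mean > best[0]:
                best = (mean, r, c)
    return best


if __name__ == "__main__":
    print(delta_f_over_f([100.0, 100.0, 101.0, 102.0], n_baseline=2))
    print(best_2x2_block([[0, 0, 0], [0, 1, 2], [0, 3, 4]]))
```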
Effects of the Masker on the Probe Response
Figure 2 shows typical optical signal traces for the probe alone and the masker–probe pair superimposed on the trace for the masker alone. At 0–10 ms probe delays, the onset response to the two-tone and that to the masker alone were roughly the same in shape, but the former was broad at 20 ms delay due to the probe response. While the probe response sometimes appeared to form a sharp peak, as shown at 15 ms delay, the changes were not always large as compared with signal noise.
Therefore, in order to characterize the probe response, the spatial response maps at the time of the maximal response to the probe alone were examined (Fig. 3). At 0–10 ms probe delays, the probe response tended to fuse with the masker response. At 15–20 ms delays, however, the probe response emerged distinctly in the 10 kHz isofrequency band in field A, with the exception of that for the 10 kHz masker. The spatial spread of the probe response was smaller than that of the response to the 10 kHz tone alone, i.e. the response to the probe presented >10 ms after the masker onset was ‘spatially focused’ relative to that to the probe alone. This spatially focused probe response appeared in the higher-frequency (dorsocaudal) side of the probe isofrequency band if the masker frequency was lower (5 or 8 kHz) and vice versa. If the frequencies of the two tones were close to each other (8 and 12 kHz masker conditions), the location of the focused probe response at 15 ms delay was unstable.
The focused probe response along the probe isofrequency band could be found in four animals at 20 ms probe delay in the 5 and 15 kHz masker conditions, in three animals at 15 ms probe delay in the 5 and 15 kHz masker conditions and in three animals at 20 ms probe delay in the 8 and 12 kHz conditions (Fig. 4). In all of the cases used for the analysis shown in Figure 4, the spatial spread of the probe response was significantly smaller than that of the response to the probe alone (one-tailed paired t-test, P = 0.006, 0.003, 0.027, 0.028, 0.006 and 0.006 from the left column to the right column). The spatial spread of the probe response at 20 ms probe delay was correlated with the separation between the frequencies of the two tones on the logarithmic scale (Spearman's ranked correlation, rs = 0.75, P = 0.004; n = 14).
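For readers unfamiliar with the rank correlation used here, the following is a minimal sketch of Spearman's rs computed as the Pearson correlation of average ranks. It is not the authors' analysis code, it does not compute the P value, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In the analysis above, `x` would be the frequency separation in octaves and `y` the spatial spread of the probe response for each observation.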
We next examined the probe response at longer probe delays (Fig. 5). At 5 kHz masker frequency (1 octave separation from the probe), the probe response recovered as the probe delay increased from 30 to 100 ms. At 15 kHz masker frequency (0.58 octave), the probe response could be identified at 30 and 50 ms probe delays. However, the spatial focusing of the probe response could not be clearly observed until ~100 ms delay because the probe response was buried in high-frequency components. These components near the baseline of the optical signal traces may be signal noise, because the signal measured in each channel is the summed neuronal activity in a 250 × 250 μm² cortical area and it is possible that spontaneous activities are canceled.
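The octave separations quoted above follow from the base-2 logarithm of the frequency ratio. A short check, with a hypothetical function name:

```python
import math


def octave_separation(masker_khz, probe_khz=10.0):
    """Distance between two tone frequencies in octaves (always >= 0)."""
    return abs(math.log2(masker_khz / probe_khz))


if __name__ == "__main__":
    for masker_khz in (5, 8, 10, 12, 15):
        print(masker_khz, round(octave_separation(masker_khz), 2))
```

For the 5 and 15 kHz maskers this gives 1 octave and about 0.58 octave, matching the values in the text.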
Response Characteristics in the Probe Isofrequency Band
With the increase of the probe delay, the two-tone response measured at the time of the maximal response to the probe alone at each delay was reduced relative to that to the probe alone (Fig. 6). This reduction was significant in all five masker cases in the 5–20 ms probe delay conditions (Welch's t-test, P < 0.05; n = 4). Within each probe delay condition, the plots of the response ratio were V-shaped and the response was most markedly reduced when the frequencies of the two tones were the same (10 kHz), with the exception of the 10 ms probe delay condition. This V-shaped curve appeared to be more sharply angled at 15 or 20 ms delay than at 5 ms delay. At 5 and 20 ms probe delays, the response ratio was correlated with the separation between the frequencies of the two tones on the logarithmic scale (Spearman's ranked correlation, rs = 0.38, P = 0.049 at 5 ms delay, rs = 0.55, P = 0.009 at 20 ms delay; n = 20).
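One way to read the response ratio used above is sketched below: the two-tone trace is sampled at the frame where the probe-alone trace peaks, then divided by that peak. This reading is an assumption, since the text does not state whether any further correction (for example, subtracting the masker-alone trace) was applied. The function names are hypothetical.

```python
def peak_index(trace):
    """Index of the largest sample in a trace."""
    return max(range(len(trace)), key=lambda i: trace[i])


def response_ratio(two_tone_trace, probe_alone_trace):
    """Two-tone amplitude at the probe-alone peak time, relative to that peak.

    Both traces are assumed to be aligned to the same probe delay.
    """
    t = peak_index(probe_alone_trace)
    return two_tone_trace[t] / probe_alone_trace[t]


if __name__ == "__main__":
    probe_alone = [0.0, 2.0, 4.0, 2.0]
    two_tone = [0.0, 1.0, 2.0, 1.0]
    print(response_ratio(two_tone, probe_alone))
```

A ratio below 1 indicates that the response at the probe's peak time was reduced by the masker.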
The reduction of the probe response at 10 ms probe delay did not depend on the masker frequency, as shown in Figure 6. To identify the factor(s) on which the reduction of the probe response depended, the response ratio was plotted as a function of the distance between the masker and the probe isofrequency bands on the cortical surface (Fig. 7). While the probe response was reduced evenly upon each 0–5 ms probe delay, that at 10 ms depended significantly on the spatial separation between the two isofrequency centers (Kruskal–Wallis ANOVA, P = 0.027). A post hoc test revealed that the response amplitude in the group with a separation of 330–580 μm was significantly lower than that in the group with a separation of >580 μm (Scheffe's F test, P = 0.040). At 20 ms probe delay, the probe response was evenly reduced over all distances.
We next examined the response amplitude in the masker isofrequency band in order to compare it with that in the probe isofrequency band (Fig. 8). With the increase of the probe delay, the two-tone response in the masker isofrequency band was also reduced if it was measured at the time of the maximal response to the probe alone. For a given probe delay, however, the amplitude was not always proportional to that in the probe isofrequency band. A significant difference in response amplitude between the two isofrequency bands was found at 20 ms probe delay in the 5 (one-tailed paired t-test, P = 0.005; n = 4), 12 (P = 0.004) and 15 (P = 0.037) kHz masker conditions. These data show that a spatial contrast in response between the two isofrequency bands was produced as the probe delay increased.
Effects of the Probe on the Masker Onset Response
Two-tone interaction was examined by measuring the latency and amplitude of the maximal response to the two-tone pair in each masker isofrequency band (Fig. 9). When the frequencies of the two tones were different, neuronal two-tone interaction was suppressive rather than facilitative. This suppressive effect became stronger as the frequencies of the two tones became closer to each other (8 and 12 kHz masker conditions). When the frequencies were the same (10 kHz), the response rose a little more sharply and became stronger, regardless of the probe delay. In general, if the frequencies were different, the latency was somewhat shortened but the amplitude was reduced.
A similar suppressive two-tone interaction was also observed in the probe isofrequency band, although the differences among animals were larger (Fig. 10). The peak latency of the maximal response at lower masker frequencies (5 and 8 kHz) became longer as the probe delay increased. This shows that, in the probe isofrequency band, the probe was more effective than the masker and the probe response sometimes exceeded the masker response (see the 5 kHz masker condition in Fig. 5). At higher masker frequencies (12 and 15 kHz), the response amplitude was less suppressed than that in the masker isofrequency band. This was probably due to the influence of the high ratio of the response to those maskers, as shown in Figure 9B.
Using the optical recording method, we obtained results showing spatiotemporal two-tone interaction in the guinea pig auditory cortex. Our most interesting finding was that the cortical area activated by the probe was focused (or narrowed) as the probe delay increased from 0 to 20 ms. We will discuss this spatial focusing of the probe response to clarify its physiological mechanisms and to compare it to psychoacoustical phenomena in human subjects. Another important finding was suppressive two-tone interaction. We will discuss this in relation to previous microelectrode studies.
In a microelectrode study, the latency of the first spike in field A in the guinea pig auditory cortex ranged from 8.0 to 34.0 ms, with a mean of 14.1 ms (Wallace et al., 1999), which is shorter than the latency of the optical responses measured in our study. The optical signal appeared with ~20 ms latency and reached a peak at ~27 ms. This discrepancy may be attributed to two factors: (1) the optical signal is not a record of single spikes but a summation of the membrane potentials of active neurons in a 250 × 250 μm² cortical area; (2) the optical signal reflects the intracellular membrane potential changes in the dendrites rather than those in the axons and cell soma (Grinvald et al., 1986).
Mechanisms of Spatial Focusing of the Probe Response
With the increase of the probe delay from 0 to 20 ms, the two-tone response in the probe isofrequency band was reduced if it was measured at the time of the maximal response to the probe presented alone at each delay (Fig. 6). At 15–20 ms probe delay, however, the probe response was spatially focused (Figs 3 and 4) and the response amplitude in the probe isofrequency band exceeded that in the masker isofrequency band (Fig. 8). This spatial focusing was thought to be induced mainly by neuronal inhibition, partly by short-term adaptation and not at all by two-tone suppression.
A recent optical study showed that inhibitory areas produced by γ-aminobutyric acid (GABA) surround the excitatory frequency bands, and that the resultant inhibition begins ~10 ms later than the onset of the response (Horikawa et al., 1996). Recent electrophysiological and immunocytochemical studies have indicated that the horizontal spread of GABAergic monosynaptic connections in cortical layer II–III is restricted to within ~500–600 μm (Salin and Prince, 1996; Tanigawa et al., 1998). At 10 ms probe delay, we observed that the amplitude of the probe response depended on the spatial separation from the center of the masker receptive area (580 μm) rather than on the separation between the frequencies of the two tones (Fig. 7). Therefore, neuronal inhibition may filter the response widely around the masker receptive area and cause the probe response to be focused spatially.
Short-term adaptation is caused by the depletion of immediate neurotransmitter stores (Furukawa et al., 1982). This adaptation is expected to block short-latency (0–4 ms) postsynaptic potentials (Duan and Canlon, 2001) and to last for a few hundred milliseconds (Smith, 1977; Harris and Dallos, 1979). In the cat peripheral auditory system, forward (sequential) two-tone masking occurs proportionally with the amplitude of the preceding masker response (Smith, 1977; Harris and Dallos, 1979; Abbas and Gorga, 1981). In the cat auditory cortex, however, the probe response was reduced independently of the preceding masker response (Calford and Semple, 1995; Brosch and Schreiner, 1997). This difference indicates that the probe response is affected in a complex way by short-term adaptation of excitatory and inhibitory responses to the masker in a series of higher auditory centers (Finlayson and Adam, 1997).
Two-tone suppression in the cochlea is not related to the spatial focusing of the probe response. Previous microelectrode studies have determined the conditions of the masker frequency and intensity in which two-tone suppression occurs in the cat primary auditory nerve (Galambos and Davis, 1944; Sachs and Kiang, 1967; Arthur et al., 1971). If the masker intensity is the same as the probe intensity, as was the case in our study, two-tone suppression scarcely occurs regardless of the frequencies of the two tones. Moreover, the effects of two-tone suppression do not change with the increase of the probe delay, because two-tone suppression is a mechanical phenomenon in the cochlea.
Suppressive Two-tone Interaction
Previous microelectrode studies showed both suppressive and facilitative effects of simultaneous two-tone stimuli on responses in the auditory cortex of the monkey (Shamma and Symmes, 1985) and the cat (Sutter and Schreiner, 1991; Nelken et al., 1994; Sutter et al., 1999). Shamma and Symmes found a single neuron displaying inhibitory effects by which the response to the first tone was eliminated, and another neuron displaying a two-fold increment in response relative to that to the first tone alone (Shamma and Symmes, 1985). Those influences were observed even if the frequencies of the two tones were an octave apart, although the intensity of the second tone was 12 dB higher than that of the first tone.
Based on the neuronal population activity, however, we showed that two-tone interaction was suppressive rather than facilitative (Figs 9 and 10). For the probe isofrequency band, we observed that the spatially focused response appeared dorsocaudally if the masker frequency was lower than the probe frequency and vice versa (Fig. 3). This indicates a topographic map of suppressive effects changing along the isofrequency band; the two-tone response in the dorsal area of the probe isofrequency band may be reduced more strongly by the lower-frequency maskers than by the higher-frequency maskers and vice versa. A similar topographic map of suppressive effects has been observed in the auditory cortex of the cat (Sutter and Schreiner, 1991) and the ferret (Shamma et al., 1993).
The suppressive effect was observed not only in the probe isofrequency band (Fig. 10), but also in that of the masker (Fig. 9). This paradoxical backward effect of the probe, which weakened the masker onset response, may be explained by the difference between the peak latency of the optical signal (22.7–31.6 ms) and the spike latency (8.0–34.0 ms; Wallace et al., 1999). Spikes evoked by a probe presented within a few tens of milliseconds after the masker onset may somehow catch up with the masker onset response somewhere in the auditory pathway. This suppressive effect of the probe on the masker response is thought to enhance the separation of the probe response from the masker response.
Comparison with Psychoacoustic Observations in Human Subjects
Hirsh reported that if a listener was asked to judge the temporal order of two tones, each of 500 ms duration, the minimum onset asynchrony that he or she could discriminate was 15–20 ms (Hirsh, 1959). This value is consistent with that of the present data in which spatial focusing of the probe response in the cortex was observed at 15–20 ms probe delays. We could not identify the spatial focusing of the probe response at 30–50 ms delays in the 15 kHz masker condition, but this may be due to the high noise level near the baseline of the optical signal traces in our system. These results suggest that spatial focusing of neuronal responses induced by subsequent tones is a neuronal basis for detecting those asynchronous tones. The value of 15–20 ms was also found when visual and tactile modalities were employed and is therefore considered to be a fundamental limit for discriminating the order of stimuli in the nervous system (Hirsh and Sherrick, 1961).
The spatial spread and amplitude of the probe response depended on the masker frequency (Figs 4 and 6). It has been shown (Darwin and Hukin, 1998) that an asynchronous 500 Hz frequency component was segregated from subsequent first (396–521 Hz), second (2100 Hz) and third (2900 Hz) formant frequencies during the onset asynchrony increment from 0 to 40 ms. If the neuronal responses to subsequent components are spatially focused in the human auditory cortex, the spatial spread and amplitude of the response to the first formant frequency would be more markedly reduced than those to the second and third formant frequencies. Nevertheless, those three formant frequencies could be grouped and segregated from the asynchronous component by the listener. These findings and their implications indicate that the spatial focusing of neuronal responses and the suppressive interaction of the two onset responses observed in our study contribute to auditory segregation. The perceptibility of slightly asynchronous tones may be influenced not by the absolute amplitude of the response in each isofrequency area in the auditory cortex, but by the contrast in neuronal response between the isofrequency area and its surrounding area.
This work was supported by Grants-in-Aid for Scientific Research 09671736 and 11680783, from the Ministry of Education, Science, Sports and Culture, Japan.