The aim of this study was to determine whether auditory perceptual learning is associated with changes in the frequency organization and/or neuronal response properties of primary auditory cortex (AI). Five out of six cats trained on an 8 kHz frequency discrimination task showed improvements in performance that reflected changes in discriminative capacity. Quantitative measures of the response characteristics and frequency organization of AI revealed that the frequency organization of AI in trained cats did not differ from that in controls, but there was a tendency for neurons with a CF immediately above 8 kHz to have slightly broader tuning in the trained cats than in controls, and neurons in one of these bands had significantly shorter latency. These results are in accord with recent reports that cortical topography in primary visual cortex is unchanged in animals trained on visual discrimination tasks, but are at variance with an earlier report of enlarged representations of training frequencies in AI of monkeys trained on a frequency discrimination task. It is concluded that substantial changes in perceptual discriminative capacity can occur without change in primary cortical topography and with only small changes in neuronal response characteristics.
Plasticity in the receptive field (RF) properties of cortical neurons and in the functional organization of primary sensory cortices of adult animals has been described in numerous studies over the last 20 years (for reviews see e.g. Weinberger, 1995; Buonomano and Merzenich, 1998; Gilbert, 1998; Kaas, 2000). Such plasticity occurs as a consequence of altered patterns of input produced by either peripheral injury or specific training procedures. More recently, it has been proposed that improvements in perceptual performance with training (perceptual learning) might reflect plasticity at early stages of cortical sensory processing. This proposal has commonly been based on indirect evidence, notably the specificity of improvements in some perceptual discriminations to a particular parameter of the training stimulus or to the region of the receptor surface to which the training stimuli are presented (e.g. Ramachandran and Braddick, 1973; Karni and Sagi, 1991; for a review see Karni and Bertini, 1997). Changes in primary sensory cortex are not the only possible explanation of such specificity (Mollon and Danilova, 1996; Dosher and Lu, 1999), however, and direct evidence from studies of cortical response characteristics and organization in trained animals has been equivocal. Three recent studies of the response characteristics of neurons in primary visual cortex (V1) of monkeys exhibiting perceptual learning on various discrimination tasks have shown either no changes or relatively subtle changes in RFs and no changes in cortical topography (Crist et al., 2001; Schoups et al., 2001; Ghose et al., 2002). In contrast, changes in neuronal response characteristics and in cortical topography in primary somatosensory (S1) and auditory (AI) cortices in monkeys have been described in association with perceptual learning in those modalities (Recanzone et al., 1992, 1993).
In the auditory study, Recanzone et al. (1993) reported that in owl monkeys trained on a frequency discrimination task the area of representation of the training frequencies in AI was enlarged (by a factor of ∼7 to ∼9, depending on frequency) relative to untrained (control) monkeys, and that within the group of trained animals, larger areas of representation were associated with superior discrimination performance. The frequency tuning of multi-neuron clusters in these enlarged areas of representation was sharper, and response latency was longer, relative to that in the areas in which the same frequency ranges were represented in control animals. Expansion of the representation of training frequencies must presumably occur at the expense of the representation of adjacent frequencies. In accordance with this expectation, Recanzone et al. (1993) reported that in two animals in which frequency discrimination was tested using a range of frequencies above the target frequency (+ΔF discrimination) there was a decreased cortical representation of a range of frequencies below the target frequency.
In experimental studies, improvement in performance on a perceptual task is likely to involve improvement in performance of the task used to assess perceptual performance (commonly termed procedural learning) in addition to possible improvement in perceptual discrimination per se (perceptual learning) (Robinson and Summerfield, 1996). Although these two forms of learning presumably occur concurrently, procedural learning is generally assumed to be reflected in a rapid initial phase of improvement in performance, while perceptual learning is assumed to depend on extensive training and to be a more gradual process. Procedural learning would also be expected to generalize to any other perceptual discrimination tested with the same procedures, while the specificity or generalizability of perceptual learning is an empirical issue. In accordance with these views, Recanzone et al. (1993) reported that a fast initial phase of improvement in the performance of their monkeys on the auditory frequency discrimination task, which they described in terms of conceptual learning of the task, generalized to other (untrained) frequencies. In contrast, the slower gradual phase of learning did not generalize, but was associated with a decrement in performance at adjacent frequencies.
We have studied the effect of perceptual learning at one training frequency on the representation of that frequency in AI of cats that showed perceptual learning and reached asymptotic performance levels on a frequency discrimination task. In order to establish the specificity of perceptual learning on this task, we also examined the extent to which improvements generalized to distant and adjacent frequencies. Although we found some evidence suggestive of small changes in the breadth of frequency tuning and the response latency of neurons in the cortical area of representation of the training frequencies, we found no evidence of changes in the area of cortex devoted to these or adjacent frequencies, or of decrements in performance at distant or adjacent frequencies. A preliminary report of some of these data has been published (Brown et al., 2002).
Materials and Methods
Six domestic cats trained on the frequency discrimination task were housed together from 8 to 12 weeks of age, and were handled daily and given ad lib food and water up until the beginning of training (at 12 weeks). During the training period cats were weighed daily and fed measured quantities of dry food after their training session and on days which they were not trained, so that they maintained, or gained in, body weight. Four (untrained) cats used as controls were tested electrophysiologically at between 6 and 12 months of age. All procedures carried out on these animals were approved by the Monash University Department of Psychology Animal Ethics Committee.
The effects of perceptual learning on the frequency organization of AI were examined using a +ΔF frequency discrimination task at 8 kHz, because this frequency was one of those used by Recanzone et al. (1993) and in most cats is represented on the surface of the middle ectosylvian gyrus (MEG), where accurate mapping is possible. In most cats, lower frequencies are represented in the rostral bank of the posterior ectosylvian sulcus (PES), where it is much more difficult to map the area of representation of a particular frequency with the required degree of precision. One group of trained cats (n = 3) was initially trained to a near-asymptotic level on the frequency discrimination task at 3 kHz to ensure that procedural learning was complete before training was commenced at 8 kHz. This procedure also enabled an assessment of any effect on performance at this frequency of the subsequent training and perceptual learning at 8 kHz. The other group of trained cats (n = 3) was extensively trained on only the 8 kHz task and in the course of training was tested on interpolated trials on discrimination of 8 kHz using lower-frequency stimuli (a –ΔF task) and on discrimination at 3 kHz.
The psychophysical training methods were based on those reported by May et al. (1995), with some modification of stimulus presentation parameters. At 12 weeks of age, each cat was trained on an auditory association task by giving it small amounts of a highly desirable food whenever an auditory click was presented. The cat’s behaviour was then shaped such that it was required to perform a bar press, which initiated the presentation of the click followed by the food reward. This behaviour was further shaped into a hold–release paradigm (see below), suitable for use on an operant platform, as described by May et al. (1995).
All frequency discrimination training was carried out in a sound attenuated room (>70 dB at 8.0 kHz). Training began when the cat could sit on the operant platform and perform the hold–release task. The cat initiated a trial by pressing and holding down a large paw pedal. At a randomly variable time (between 1 and 4 s) after initiation of the bar press, a sequence of comparison tones (S1), randomly variable in number between 2 and 8, was presented. The S1 tones were either 8.0 or 3.0 kHz tone pips, 600 ms in duration with 5 ms rise/fall intervals, and a 400 ms inter-stimulus interval. All tonal stimuli were generated by Tucker Davis Technologies System 2 hardware, which was controlled by custom designed software. A difference tone (S2), different in frequency but with otherwise the same stimulus parameters as the S1 tones, was presented 400 ms after the offset of the last S1 stimulus. The cat was required to release the paw pedal on detection of the S2 stimulus to receive a food reward delivered through a spout positioned directly in front of and at the level of its mouth. The food reward (1–4 ml) consisted of liquidized and filtered tinned cat food diluted 50% in water. The cat was restrained on the operant platform such that if it did not have its head in a forward position with its mouth directed at the food spout, the reward would fall to a collecting dish out of reach. The spout position ensured that the cat’s head was directed toward the speaker positioned 1 m directly in front of and ∼20° above the cat’s interaural horizontal plane. The position of the speaker was elevated to reduce the filtering effects of the outer ear (Rice et al., 1992).
The cat was required to release the paw pedal in the interval 20–1000 ms after the onset of the S2 stimulus. Releases that occurred between the start of the first S1 tone pip and 20 ms after the start of the second S1 tone pip ended the trial, and were recorded as pre false positives (pre-FP). Releases after the pre-FP period and prior to 20 ms after the start of the S2 stimulus ended the trial and were recorded as false positives (FP). Releases after the FP period and prior to 400 ms after the cessation of the S2 stimulus completed the trial and were recorded as hits. Non-releases and releases after the hit period were recorded as misses. During the early part of the frequency discrimination training, and at various times throughout the training, pre-FP, FP and misses were followed by a time-out period of 4 s, during which another trial could not be initiated. Animals were video-monitored during training to ensure they were attentive to the task and not distressed.
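The response windows described above can be summarized in a short sketch. It assumes that release times are measured in ms from the onset of the first S1 tone and that window boundaries are half-open; the function and constant names are illustrative and are not taken from the training software. Releases before the first S1 tone are not covered by the description and are rejected here.

```python
from typing import Optional

TONE_MS = 600      # tone duration (from the text)
ISI_MS = 400       # inter-stimulus interval (from the text)
GRACE_MS = 20      # a release within 20 ms of an onset is scored against the preceding window
HIT_TAIL_MS = 400  # hit window closes 400 ms after S2 offset


def classify_release(release_ms: Optional[float], n_s1: int) -> str:
    """Classify one trial; release_ms is None when the pedal was never released."""
    if not 2 <= n_s1 <= 8:
        raise ValueError("the number of S1 tones varied between 2 and 8")
    if release_ms is not None and release_ms < 0:
        raise ValueError("releases before the first S1 tone are not described")

    onset_step = TONE_MS + ISI_MS                   # onset-to-onset interval, 1000 ms
    pre_fp_end = onset_step + GRACE_MS              # 20 ms after the second S1 onset
    s2_onset = n_s1 * onset_step                    # 400 ms after the last S1 offset
    fp_end = s2_onset + GRACE_MS                    # 20 ms after S2 onset
    hit_end = s2_onset + TONE_MS + HIT_TAIL_MS      # 400 ms after S2 offset

    if release_ms is None:
        return "miss"
    if release_ms < pre_fp_end:
        return "pre-FP"
    if release_ms < fp_end:
        return "FP"
    if release_ms < hit_end:
        return "hit"
    return "miss"
```

With three S1 tones, for example, S2 begins at 3000 ms, so a release at 3300 ms is scored as a hit and a release at 3010 ms as a false positive.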
All discrimination training was carried out with S2 at a higher frequency than S1 (i.e. with +ΔF), but as described earlier, some interpolated 8 kHz test trials were carried out using S2 values lower than S1 (–ΔF). The frequency range over which S2 varied, and the intervals between S2 values, were adjusted throughout each cat’s training so that performance within a session was maintained near 70–90% correct. During each session, response performance across all S2 stimuli was monitored online so that time-out periods and reward volume could be adjusted to help maintain the cat’s performance. Cats performed between 50 and 300 trials per session; any training session in which a cat completed considerably fewer trials than its daily average was repeated later in the day. Cats were trained for 9–12 months with one or two sessions per day, 3–5 days a week, up to the day prior to electrophysiological mapping of AI.
The stimulus sound pressure level (SPL) was sampled at 1.0, 1.05 and 1.1 m from the free-field speaker (Vifa, Model P25WO-00) using a Brüel and Kjær 0.5 inch condenser microphone and measuring amplifier (Brüel and Kjær Type 4133 and Type 2606). These locations encompassed the range over which the cat’s head could be located while attending to the stimuli. The SPL of the stimuli across the three locations varied by no more than 5 dB at any given frequency in the range of S1 and S2 frequencies used in training. At a given training frequency, the average of the SPLs across the three locations was entered into a calibration table used by the computer to generate stimuli at the required SPL. To eliminate any cues for discrimination based on loudness differences between stimuli, the stimulus SPL was randomly varied over the range of 60 ± 5 dB.
The number of S1 repeats within a trial varied randomly between 2 and 8. The probability of the S2 occurring after any of the S1 stimuli was equal and remained constant for all training sessions. The probability of chance performance for the modified variable hold–release paradigm therefore decreased with successive S1 stimuli within a trial. The overall chance level of performance for this training paradigm is 0.14. For the determination of minimum discrimination thresholds a 0.5 critical level of performance was set, which is well above chance, and therefore gave a conservative estimate of minimum discrimination thresholds.
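The text does not give the derivation of the 0.14 chance level. One reading that reproduces the value, offered here only as an assumption, is that a single release made without reference to the stimuli has an equal chance of falling in any of the seven possible S2 positions (after 2, 3, ..., 8 S1 tones). The function name is illustrative.

```python
def blind_release_chance(min_s1: int = 2, max_s1: int = 8) -> float:
    """Chance of a blind release landing in the S2 window, assuming one
    equally likely S2 position per possible S1 count (an assumption)."""
    n_positions = max_s1 - min_s1 + 1
    return 1.0 / n_positions


print(round(blind_release_chance(), 2))
```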
Sessions were divided into blocks comprising an approximately equal number of trials for the analysis of frequency discrimination, which was measured in terms of the just detectable frequency difference (ΔF). Psychometric functions were plotted from the hit rate for each S2 after correction for the FP rate. The hit rate is defined as the number of hits for each S2 divided by the number of trials on which the S2 was presented. The FP rate served as an estimate of the cat’s guess rate and as an indicator that cats were under stimulus control. The total FP rate (FPtr) is the number of false positives within a block of trials divided by the total number of S1 presentations in that block. The FP correction is therefore 1-FPtr. Discrimination performance was then calculated by multiplying the hit rate for each S2 stimulus by 1-FPtr. Four-parameter sigmoidal functions (Hill functions) were fitted to the discrimination functions (see Fig. 1). The parameters from the sigmoidal function were then used to calculate the frequency at a performance level of 0.5.
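The FP correction and the threshold read-out can be sketched as follows. The correction follows the definitions above. The Hill parameterization, y = y0 + a·x^b / (c^b + x^b), is an assumption, since the text specifies only a four-parameter sigmoidal (Hill) function; the fitting step itself is omitted, and all function and parameter names are illustrative.

```python
def corrected_performance(hits, s2_trials, false_positives, s1_presentations):
    """Hit rate for one S2 value multiplied by the FP correction (1 - FPtr)."""
    hit_rate = hits / s2_trials
    fp_tr = false_positives / s1_presentations
    return hit_rate * (1.0 - fp_tr)


def hill4(x, y0, a, b, c):
    """Four-parameter Hill function (assumed form): floor y0, range a,
    slope exponent b, and half-range point c."""
    return y0 + a * x**b / (c**b + x**b)


def threshold_delta_f(p, y0, a, b, c):
    """Invert hill4 to find the frequency difference at performance level p."""
    r = (p - y0) / a
    if not 0.0 < r < 1.0:
        raise ValueError("performance level lies outside the fitted range")
    return c * (r / (1.0 - r)) ** (1.0 / b)
```

For example, 18 hits in 20 presentations of a given S2, with 5 false positives in 250 S1 presentations, gives a hit rate of 0.9, a correction of 0.98, and a corrected performance of about 0.88. Calling `threshold_delta_f(0.5, ...)` with the fitted parameters returns the ΔF at the 0.5 criterion.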
Procedures for mapping AI were generally as described previously (e.g. Rajan et al., 1993) except that rather than mapping the entire frequency representation a fine-resolution map of the region over which neurons responded to 8 kHz was obtained. Cats were fasted for 18 h prior to the induction of surgical anesthesia with sodium pentobarbitone (Nembutal, 45 mg/kg). Throughout surgery and during recording, supplemental doses of anesthetic (Nembutal) were delivered via a slow i.v. infusion pump at a rate of 2–5 mg/kg/h or as required. The level of anesthesia was monitored using ECG, pedal reflex and ocular dilation, and the cat’s core body temperature was maintained at around 37.5°C using a DC heating pad controlled by a rectal thermistor. Following tracheal cannulation the cat was placed in a head holding frame, and the bullae were opened to allow stainless-steel spring electrodes to be placed on the round windows. Polyethylene tubes (0.38 mm I.D.) were placed into the bullae to allow static pressure equalization, and the bullae were then sealed with dental cement. Auditory compound action potential (CAP) audiograms were obtained by measuring the N1 threshold for stimulus frequencies between 2 and 40 kHz (5 ms duration; 0.4/3.0 ms rise/fall interval), using signal averaging (10 ± 1 µV criterion). The left auditory cortex was exposed and a calibrated digital photograph of the cortex obtained. The position of each electrode penetration was plotted onto the photograph based on cortical vasculature. The x–y coordinate of each penetration could then be determined from the coordinate of the central pixel under each penetration position. A modified Davies chamber with its top parallel to the gyral surface was positioned over the exposure, secured to the skull, filled with sterile saline, and sealed with a glass plate.
Electrode penetrations were thus made approximately orthogonal to the surface of the MEG, using a hydraulic micromanipulator mounted in the glass plate on top of the Davies chamber. Recordings were made using 1–2 MΩ glass-insulated tungsten microelectrodes. The electrode was advanced 500–700 µm into the cortex before searching for tone-evoked multi-unit (neuronal cluster) activity. Once a cluster containing clearly-defined action potentials was obtained, the characteristic frequency (CF; frequency at which threshold is lowest) was determined audiovisually (just detectable increase in spike rate, ±5 dB SPL; ±0.1 kHz). A response area was obtained for each cluster by presenting, under computer control, a frequency–intensity matrix (FI) that was centred about the audiovisually determined CF and varied over an SPL range from below threshold to ∼60 dB suprathreshold. The FI stimuli were 50 ms pure tone bursts with 5 ms rise/fall intervals, presented at a rate of 2 Hz. Frequency–intensity combinations were presented pseudorandomly across the matrix, and the complete matrix was presented five times in a different pseudorandom order. For the collection of these quantitative data, a Schmitt trigger was set at a level well above the noise floor and exceeded only by clearly defined action potentials, which produced Schmitt trigger output pulses timed by the computer with an accuracy of 10 µs. The software reported the number of spikes within a specified count window and the mean first-spike latency at each FI combination. The CF, best frequency (BF; the frequency which elicits the largest number of action potentials at an intensity within the FI matrix range), Q20 (CF/bandwidth 20 dB above CF threshold) and the CF-L20 (latency 20 dB above threshold at CF) were obtained from the response area. Following the FI, an 8 kHz input–output function, over a 70–80 dB range, was obtained for each cluster.
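The definitions of CF, BF and Q20 can be illustrated with a minimal sketch that operates on a frequency–intensity matrix of spike counts. The spike-count criterion for threshold, the use of the outermost responsive grid frequencies as bandwidth edges (no interpolation), and all names are assumptions made for illustration. The text does not state how these measures were extracted from the response area, and CF-L20 is not covered.

```python
def response_area_measures(freqs_khz, levels_db, counts, criterion):
    """Return CF, CF threshold, BF and Q20 from a frequency-intensity matrix.

    counts[i][j] is the spike count at levels_db[i] and freqs_khz[j].
    """
    # CF: the frequency whose threshold (lowest level reaching criterion) is lowest.
    cf, cf_threshold = None, None
    for j, freq in enumerate(freqs_khz):
        driven = [lvl for i, lvl in enumerate(levels_db) if counts[i][j] >= criterion]
        if driven and (cf_threshold is None or min(driven) < cf_threshold):
            cf, cf_threshold = freq, min(driven)
    if cf is None:
        raise ValueError("no frequency-level combination reaches the criterion")

    # BF: the frequency of the largest spike count anywhere in the matrix.
    _, bf = max(
        ((counts[i][j], freq)
         for i in range(len(levels_db))
         for j, freq in enumerate(freqs_khz)),
        key=lambda pair: pair[0],
    )

    # Q20: CF divided by the response bandwidth 20 dB above CF threshold.
    level_20 = cf_threshold + 20
    if level_20 not in levels_db:
        raise ValueError("matrix has no row 20 dB above CF threshold")
    row = counts[levels_db.index(level_20)]
    driven_freqs = [f for f, n in zip(freqs_khz, row) if n >= criterion]
    if not driven_freqs:
        raise ValueError("no response 20 dB above CF threshold")
    bandwidth = max(driven_freqs) - min(driven_freqs)
    q20 = cf / bandwidth if bandwidth > 0 else float("inf")

    return {"cf": cf, "cf_threshold": cf_threshold, "bf": bf, "q20": q20}
```

Because Q20 is CF divided by bandwidth, broader tuning corresponds to a smaller Q20.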
The psychometric frequency-discrimination functions from which threshold measures were derived are illustrated in Figure 1 by the functions for selected blocks of trials throughout the period of 8.0 kHz training for cat 01-19. The threshold (50% performance) delta frequency (ΔFt) decreases over the period of training. This is reflected in an increase in the slope of the psychometric function, with the greatest change occurring in the early period of training. The frequency discrimination thresholds over the training period of all cats are shown in Figure 2. For all cats, the initial few thousand trials in which the animals were learning the task were collected over many sessions and did not generate measurable discrimination thresholds. For the three cats (00-11, 00-21 and 00-22: Fig. 2D, E and F, respectively) given extensive training on the 3.0 kHz task prior to training at 8.0 kHz, only the data for the later stages of training at 3 kHz, at which performance levels were near-asymptotic, are shown. The transfer from 3 to 8 kHz required the presentation of discrimination trials at intervening S2 frequencies, which did not generate measurable discrimination thresholds; hence the gap in the frequency discrimination threshold functions for these animals.
The rate of initial improvement in performance on the 8 kHz task, and the final minimum threshold attained, varied between and within animals in the two groups of trained cats. For the three cats trained only on the 8 kHz task (Fig. 2A–C), frequency discrimination performance showed substantial and sustained improvement over the initial 10 000–15 000 trials, and then plateaued at a level that was maintained for the remainder of the training period (i.e. reached asymptotic levels). The similarity in the form of the learning curves of the three cats is more clearly illustrated in the normalized curves presented in Figure 3A–C. The period of improvement in the performance of these animals could reflect procedural and/or perceptual learning. However, the fact that little or no improvement was seen for any of the cats over the first 20% of training trials at 8 kHz suggests that procedural learning must have occurred during the 5000 or so trials prior to those from which threshold determinations were derived, and that the improved performance shown in Figures 2A–C and 3A–C reflects perceptual learning. This interpretation is supported by the results of signal detection theory analysis presented below.
The 8 kHz performance of the three cats initially trained at 3 kHz was much less consistent. For one of these cats (00-11; Fig. 2D), the 8.0 kHz thresholds were highly variable in the blocks immediately after the frequency transfer and subsequently plateaued at a level no lower than those early in the training at that frequency. This animal therefore showed no learning at 8 kHz. The second cat (00-21; Fig. 2E) showed an improvement in threshold in the course of training at 8.0 kHz, most of which took place in the first few thousand trials. The third cat in this group (00-22; Fig. 2F) showed a sustained improvement in threshold that was comparable to that of the cats in the 8 kHz-only group, but most of this improvement occurred in the first 500–1000 trials. The fact that performance on the 3 kHz task of all three cats in this group was at asymptotic levels suggests that both procedural learning and perceptual learning were complete prior to the frequency transfer. If this is the case, the improvement at 8 kHz shown by cats 00-21 and 00-22 reflects perceptual learning at this frequency and indicates that although the perceptual learning at 3 kHz generalized to 8 kHz, this generalization was not complete (i.e. the perceptual learning at 3 kHz was partially specific to that frequency).
For each of the five cats whose thresholds at 8 kHz improved with training, false alarm rates did not change systematically in the course of training but remained approximately constant (varying in the range 0.6–3.8%), indicating that the improvement did not reflect a change in criterion. As a further assessment of whether the improvement reflected a change in perceptual sensitivity, d′ values were calculated for each cat for the discrimination of 8 from 10 kHz (i.e. for S2 = 10 kHz) for a block of trials early and at the end of training. These data are presented in Figure 4. For each of the cats, d′ increased over the training period, indicating an increase in sensitivity to that frequency difference. Although this analysis is qualified by the fact that the calculation of d′ from a single pair of hit and false alarm rates assumes that the signal and noise distributions are normal and of equal variance (Gescheider, 1997), it supports the interpretation of the improved performance as reflecting perceptual learning. The fact that the increase in d′ was larger in the animals trained only at 8 kHz (mean increase = 0.87) than in the animals trained first at 3 kHz (mean increase = 0.28) is also in accordance with the view that in the latter group perceptual learning was complete at 3 kHz and had generalized to a large extent to 8 kHz, so that less perceptual learning at 8 kHz occurred in this group.
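For reference, the standard equal-variance Gaussian calculation of d′ from a single pair of hit and false alarm rates is d′ = z(H) − z(F), where z is the inverse of the standard normal cumulative distribution. A minimal sketch follows; the function name is illustrative, and the rates used in the usage note are hypothetical values, not data from this study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance Gaussian d' from one hit rate and one false alarm rate."""
    if not (0.0 < hit_rate < 1.0 and 0.0 < false_alarm_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)
```

With the false alarm rate held constant, as reported for these animals, an increase in hit rate necessarily increases d′: for a hypothetical false alarm rate of 0.02, `d_prime(0.9, 0.02)` exceeds `d_prime(0.7, 0.02)`.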
Mean normalized learning curves for the 8 kHz-only group and for the two animals in the transfer group that showed learning at 8 kHz are shown in Figure 3D and G, respectively. These mean functions further illustrate the differences in the time course of learning and the magnitude of improvement in the two groups. They also illustrate the fact that the asymptotic performance levels achieved by the two groups differed substantially.
Further evidence on the specificity of improvements in discrimination to the training frequency and on the effects of training at one frequency range on discrimination at other frequencies is provided by the data derived from the blocks of test trials on a 3.0 kHz and/or a –ΔF 8 kHz discrimination task that were interpolated in the 8 kHz training period. The number of interpolated trials on these tasks was kept to the minimum needed to obtain an accurate determination of performance. All cats tested showed an improvement in performance on the 3.0 kHz task throughout the training period. For those cats initially trained to asymptotic levels at 3 kHz (Fig. 2E,F), the trials interpolated late in 8 kHz training showed a further slight improvement in performance. For the three cats that were not initially trained at 3 kHz (Fig. 2A–C), performance on the interpolated trials was similar to that of the better-performing animals trained at 3 kHz (Fig. 2E,F) and improved across the two blocks of interpolated trials. The mean improvement in ΔFt for the interpolated 3.0 kHz task was 0.33 kHz (0.15 octaves).
For cats that were tested on the –ΔF 8 kHz task (01-18, 01-19 and 01-20), the trials occurred early and toward the end of the training period (Fig. 2A–C). All three cats showed an improvement in discrimination performance from the initial to the final –ΔF testing, and performance on –ΔF trials after training with +ΔF was comparable (Fig. 2A) or superior (Fig. 2B,C) to that on +ΔF trials. The mean improvement on the –ΔF 8 kHz task was 0.62 kHz (0.11 octaves).
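The octave values quoted for the interpolated tasks are consistent with expressing each improvement as a frequency interval relative to the standard frequency, log2((f + Δf)/f). This is a sketch of that conversion; measuring the interval upward from the standard for both tasks is an assumption that reproduces the reported values, and the function name is illustrative.

```python
import math


def delta_octaves(standard_khz: float, delta_khz: float) -> float:
    """Frequency difference expressed in octaves relative to the standard."""
    return math.log2((standard_khz + delta_khz) / standard_khz)


print(round(delta_octaves(3.0, 0.33), 2))  # 0.15
print(round(delta_octaves(8.0, 0.62), 2))  # 0.11
```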
These observations provide no evidence for a decrement in performance at other frequencies as a consequence of training and improvement in discrimination thresholds at 8 kHz, and indicate that learning at that frequency has in fact generalized substantially to the other frequencies tested.
The CAP audiograms for all of the trained cats were within the normal range (Rajan et al., 1991). Tonotopic maps of AI of sufficient detail to provide information on cortical map changes were obtained from five trained cats and four normal (untrained) cats. The number of CF determinations made in AI of one trained cat (01-18) did not allow isofrequency contours (see text below) to be fitted to the penetration points and a quantitative map could therefore not be generated for this animal. The cortical map and neuronal response property data obtained from trained cat 00-11, which did not show perceptual learning at 8 kHz, have not been included in the group analyses for the trained cats. The tonotopic maps cover an area that extends 3–4 mm rostrocaudally and 4–5 mm mediolaterally over AI, with a frequency range from ∼4 kHz to 16 kHz. Some maps were limited on their caudal, low frequency edge by the PES. The number of penetrations in each map ranged from 40 to 88, with a mean of 71. This produced a sampling resolution of 250–500 µm across AI in the frequency range of interest.
The CF cortical map from an untrained animal is shown in Figure 5A. Isofrequency contours were fitted to the CF cortical maps using an inverse distance method (SigmaPlot, interpolated 3-D mesh plot; X/Y intervals = 15; distance weight = 6) with contour intervals set at 0.5 kHz. The isofrequency contours in Figure 5A range from 6.25 to 12.75 kHz and run mediolaterally across the cortex. The orientation of the CF axis in this map is 10.9° from the rostrocaudal plane, and across all other animals it varied between 7.3° and 26.4°. In determining the area representing these 0.5 kHz bandwidths, the lateromedial extent of the cortical map was limited by a 3 mm wide band, parallel to the frequency axis and centred on the region of the map where the isofrequency contours were approximately parallel. The area between contour lines (representing 0.5 kHz bandwidths) and within the 3 mm band was calculated from the number of underlying pixels in the calibrated digital photograph. The area of each 0.5 kHz CF band in cat 01-17 is shown in Figure 5B. It is apparent that the area of these 0.5 kHz bands varies substantially, in this case by a factor of almost 3 (compare bands centred on 7.5 and 10.5 kHz). Similar but idiosyncratic variation in the area of frequency band strips was seen in all normal cats.
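The general approach of interpolating CF between penetrations and then counting pixels within a frequency band can be sketched as follows. This is a generic inverse-distance-weighted estimate with the distance exponent set to 6 to mirror the stated distance weight; it is not a reproduction of the SigmaPlot mesh algorithm, and the regular pixel grid, pixel size, and all names are assumptions.

```python
import math


def idw_value(x, y, samples, power=6.0):
    """Inverse-distance-weighted CF (kHz) at (x, y) from (x_mm, y_mm, cf_khz) samples."""
    num = den = 0.0
    for sx, sy, cf in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return cf  # exactly on a penetration: use its measured CF
        w = d ** -power
        num += w * cf
        den += w
    return num / den


def band_area_mm2(samples, lo_khz, hi_khz, x_range, y_range, pixel_mm=0.05, power=6.0):
    """Area of the region whose interpolated CF lies in [lo_khz, hi_khz),
    estimated by counting square pixels of side pixel_mm inside the given ranges."""
    nx = round((x_range[1] - x_range[0]) / pixel_mm)
    ny = round((y_range[1] - y_range[0]) / pixel_mm)
    n_pixels = 0
    for i in range(nx):
        for j in range(ny):
            x = x_range[0] + (i + 0.5) * pixel_mm  # pixel centre
            y = y_range[0] + (j + 0.5) * pixel_mm
            if lo_khz <= idw_value(x, y, samples, power) < hi_khz:
                n_pixels += 1
    return n_pixels * pixel_mm ** 2
```

In this sketch the limits of `y_range` play the role of the 3 mm wide band that bounded the lateromedial extent of the map.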
Figures 6A and 7A show the CF cortical map and fitted isofrequency contours for two trained cats, 00-22 (trained at 3 and 8 kHz) and 01-19 (trained at 8 kHz only). For cat 00-22, the isofrequency contours ranging from 5.75 to 14.25 kHz, run mediolaterally across AI, with the CF axis oriented at 8.6° from the rostrocaudal plane. Areas of the 0.5 kHz bands in this cat (Fig. 6B) show a similar degree of variation to that seen in the untrained cat (Fig. 5B), with up to a threefold difference in the area of individual bands. The CF isofrequency contours for trained cat 01-19 (Fig. 7A) are similar in orientation and spacing to those of both the untrained cat (Fig. 5A) and the trained cat for which data are shown in Figure 6A. Note that there does not appear to be any increase in the area of the 0.5 kHz bands near the discrimination frequency of 8.0 kHz for either of the trained animals.
Data similar to those presented in Figures 5 and 6–7 were obtained in the other untrained and trained cats, respectively. To determine if there was a difference between the trained and untrained cats in the distribution of area across the 0.5 kHz CF bands, the mean area within each band was calculated for each group, and these data are shown in Figure 8. Although there is variation in the size of the bands at different frequencies, there is no suggestion of any systematic difference between untrained and trained cats. To determine whether there were any differences in the area of AI representing the range of frequencies used in the +ΔF 8 kHz training task, the area of cortex within the CF isofrequency contours ranging from 7.75 to 11.75 kHz was measured and compared across the untrained and trained cats. An ANOVA of the data presented in Figure 8 for this frequency range showed no significant difference between groups (F = 0.01; df = 1; P > 0.05) or across frequency (F = 0.97; df = 7; P > 0.05), and no significant interaction between these two factors (F = 0.31; df = 7; P > 0.05). To compare the area of AI representing frequencies that were used in tests on the –ΔF 8 kHz task, the area of cortex between CF isofrequency contours ranging from 6.25 to 7.75 kHz was measured. An ANOVA of the data presented in Figure 8 for this frequency range showed no significant difference between groups (F = 1.15; df = 1; P > 0.05) but did reveal a significant difference across frequency (F = 4.47; df = 2; P < 0.05). The interaction between these two factors was not significant (F = 0.06; df = 2; P > 0.05) indicating that the significant frequency effect was the same for both groups. Although all the trained cats that showed perceptual learning had reached asymptotic performance levels at 8 kHz, it is possible that the different training schedules had affected the CF cortical representation of the 8 kHz training frequencies. 
To investigate this possibility, a further analysis with training group as a factor was carried out. The ANOVA showed no significant difference between groups for either the +ΔF 8 kHz training frequencies (7.75–11.75 kHz; F = 0.80; df = 2; P > 0.05) or the –ΔF 8 kHz training frequencies (6.25–7.75 kHz; F = 1.24; df = 2; P > 0.05).
The data presented thus far indicate that there is no significant effect of frequency discrimination training on the frequency organization of AI as defined in terms of the distribution of CFs. However, the cats were trained using SPLs well above threshold (namely, 60 ± 5 dB), and it is possible that neuronal responses to suprathreshold stimuli might have changed without any changes in CFs. As one way of examining this possibility, isofrequency contours were plotted for the distributions of the best frequency (BF) of AI clusters in four untrained and four trained cats. Figure 9A shows the BF map and isofrequency plot for the untrained cat 01-17, the CF map for which was presented in Figure 5A. The frequency axis of the BF map is similar in orientation to that of the CF map, i.e. 9.0° from the rostrocaudal plane. The range in orientation of the frequency axis across all BF maps was 4.3–22.9° from the rostrocaudal plane. The greatest difference in orientation of the frequency axis between the CF and BF maps within one animal was 6.5° with the mean magnitude difference being 3.4°. The areas of the 0.5 kHz BF bands in cat 01-17 are shown in Figure 9B. The area within the bands varies by up to threefold across frequency, similar to the variation seen for the CF maps.
The orientation of BF isofrequency contours for trained cat 01-19 (Fig. 10A) is also similar to that of its CF contours (Fig. 7). The area of the 0.5 kHz BF bands (Fig. 10B) again shows up to a threefold difference across frequency. In this animal, the area of the band centred on 8.0 kHz is larger than that of other bands around it. However, this difference in area is no greater than that seen in untrained animals.
Figure 11 shows the mean area for each BF frequency band in the untrained and trained groups. To determine whether there were any differences in the area of AI, defined in terms of cluster BF, representing the range of frequencies used in the +ΔF 8 kHz training task, the area of cortex between BF isofrequency contours ranging from 7.75- to 12.25 kHz was measured and compared across the untrained and trained cats. An ANOVA of the data presented in Figure 11 for this frequency range showed no significant difference between groups (F = 0.85; df = 1; P > 0.05) or across frequency (F = 0.29; df = 8; P > 0.05), and no significant interaction between these factors (F = 1.07; df = 8; P > 0.05). These data indicate that training did not affect the area of AI representing the training frequencies, defined in terms of BF. Similarly, for those frequencies in the BF maps that were used in testing on the –ΔF 8 kHz task, the area of cortex within the 6.75–7.75 kHz BF isofrequency contours was measured for three untrained and four trained cats. An ANOVA of the data presented in Figure 11 for this frequency range showed no significant difference between groups (F = 1.02; df = 1; P > 0.05) or across frequency (F = 0.54; df = 1; P > 0.05), and no significant interaction (F = 0.26; df = 1; P > 0.05). A further ANOVA was again carried out to determine whether there was any difference between the two trained groups. The ANOVA showed no significant difference between groups for either the +ΔF 8 kHz training frequencies (7.75–12.25 kHz; F = 1.04; df = 2; P > 0.05) or the –ΔF 8 kHz training frequencies (6.75–7.75 kHz; F = 1.98; df = 2; P > 0.05).
A further possibility is that frequency discrimination training might result in a change in the organization of the AI frequency map that is specific to the particular frequency–SPL combination characterizing the discrimination stimulus (namely, 8.0 kHz at 60 ± 5 dB in our training paradigm). To estimate the region of AI activated by a 60 dB 8 kHz stimulus, the threshold at 8 kHz of each cluster was derived from its 8.0 kHz input–output (I/O) function, and iso-threshold contours were derived for the area used in the CF area determination (namely, the 3 mm dorsoventral band between the 6.25- and 11.75 kHz CF contours). The great majority of cluster I/O functions were monotonic or near-monotonic, and it therefore follows that a stimulus at any given SPL would activate all those clusters with threshold at or below that SPL. The cumulative areas across the threshold range were therefore derived from the iso-threshold plots for four trained and three untrained cats, and these data are presented in Figure 12. The cumulative area functions for the untrained cats are tightly clustered, whereas the functions for the trained cats are more scattered. One reason for such scatter, of course, is absolute differences in the peripheral sensitivity of the cats: although all CAP audiograms were within the normal range, absolute CAP thresholds at 8 kHz varied by as much as 10 dB. From 50 to 70 dB, however, the cumulative areas for the trained and untrained cats are very similar, and two-tailed t tests showed no significant difference between the trained and untrained animals at these three levels (50 dB: t = 0.42; 60 dB: t = 0.38; 70 dB: t = 0.27; P > 0.05 in each case). Furthermore, the slope of the regression function fitted to the data for each group over the 10–50 dB threshold range was 0.088 (r² = 0.93) for the untrained animals and 0.075 (r² = 0.67) for the trained animals. 
The similarity of these slopes, the lack of any significant difference in the area responsive to 8 kHz at the training and adjacent SPLs, and the absence of differences in the CF maps establish clearly that frequency discrimination training had no effect on the area of representation of 8 kHz in AI at any SPL.
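The logic of the cumulative-area analysis can be illustrated with a minimal sketch. It assumes, purely for illustration, that each recording site is assigned a fixed patch of cortex, whereas the areas reported here were measured from fitted iso-threshold contours. The function names, site thresholds and patch areas below are hypothetical and are not data from this study.

```python
def cumulative_area(thresholds_db, site_areas_mm2, levels_db):
    """Area with an 8 kHz threshold at or below each level.

    With monotonic I/O functions, a tone at a given SPL activates every
    site whose threshold is at or below that SPL, so the activated area
    is a running total over threshold.
    """
    return [
        sum(a for t, a in zip(thresholds_db, site_areas_mm2) if t <= level)
        for level in levels_db
    ]


def ols_slope(x, y):
    """Least-squares slope of y on x (ordinary linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical 8 kHz thresholds (dB SPL) and equal patch areas (mm^2).
thresholds = [12, 25, 25, 38, 47, 55, 63]
areas = [0.5] * len(thresholds)
levels = [10, 20, 30, 40, 50, 60, 70]

cum = cumulative_area(thresholds, areas, levels)

# Slope over the 10-50 dB range, mirroring the range used for the regression.
slope_10_50 = ols_slope(levels[:5], cum[:5])
```

Comparing such slopes, and the cumulative areas at the training SPL and adjacent levels, between groups is the kind of comparison described above.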
Improvement in frequency discrimination as a result of training was reported by Recanzone et al. (1993) to be associated with changes in the breadth of tuning and minimum latency of clusters in AI. Q20 measures of breadth of tuning (CF/bandwidth at 20 dB above CF threshold) were derived from the FI response matrices to determine whether this parameter differed for AI clusters in trained and untrained cats. The clusters used in this analysis were limited to those within the area bound by the 6.25–11.75 kHz CF contours and the 3 mm rostrocaudal band used in the CF area analysis. Figure 13A shows the mean Q20 values over the frequency range from 6.5–11.5 kHz for all untrained and all trained animals that showed perceptual learning. These data were binned into 1 kHz bands to more nearly equalize the sample sizes in the frequency bands. In the untrained animals there is a general trend for the Q20 values to increase with increasing CF, as reported by others (e.g. Calford et al., 1983). By contrast, the Q20 values for the trained animals at 8–10 kHz are smaller than that at 7 kHz and smaller than those in the untrained animals. The trend in these data suggests that those clusters with a CF up to 2 kHz above the S1 frequency may have broader frequency tuning as a result of the frequency discrimination training. The statistical analysis of Q20 values was limited to between-group comparisons because of the established relationship between Q20 and CF. Despite the apparent trend in the data there was no significant difference in the breadth of tuning between the trained and untrained animals in the 1 kHz frequency bands over this frequency range, although the differences approached significance in some cases (two-tailed t tests; 8 kHz: P = 0.10; 9 kHz: P = 0.06; 10 kHz: P = 0.08). These results suggest that there might be real differences in Q values that we did not have the power to detect.
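As an illustration of the Q20 measure (CF divided by the bandwidth 20 dB above CF threshold), the following sketch derives it from a frequency-threshold curve. The linear interpolation of the band edges on a kHz axis is an assumption made for this example, as are the function name and the two tuning curves; it is not a description of the procedure applied to the FI response matrices.

```python
def q_value(freqs_khz, thresholds_db, db_above=20.0):
    """Return CF / bandwidth measured `db_above` dB above CF threshold.

    freqs_khz must be ascending; thresholds_db[i] is the response
    threshold at freqs_khz[i]. CF is the frequency of lowest threshold.
    """
    i_cf = min(range(len(freqs_khz)), key=lambda i: thresholds_db[i])
    cf = freqs_khz[i_cf]
    level = thresholds_db[i_cf] + db_above

    def edge(step):
        # Walk outward from CF until the threshold exceeds the criterion
        # level, then interpolate linearly between the bracketing points.
        i = i_cf
        while 0 <= i + step < len(freqs_khz):
            j = i + step
            if thresholds_db[j] > level:
                frac = (level - thresholds_db[i]) / (
                    thresholds_db[j] - thresholds_db[i]
                )
                return freqs_khz[i] + frac * (freqs_khz[j] - freqs_khz[i])
            i = j
        raise ValueError("tuning curve does not reach the criterion level")

    return cf / (edge(+1) - edge(-1))


# Hypothetical tuning curves with CF = 8 kHz and a 10 dB CF threshold.
freqs = [6.0, 7.0, 8.0, 9.0, 10.0]
q_sharp = q_value(freqs, [60.0, 50.0, 10.0, 50.0, 60.0])
q_broad = q_value(freqs, [50.0, 20.0, 10.0, 20.0, 50.0])
```

Because bandwidth is in the denominator, the broader of the two hypothetical curves yields the lower value, which is why lower Q20 values are read as broader tuning.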
The data on CF first spike response latency at 20 dB above threshold (CF-L20) for AI clusters are presented in Figure 13B, and show little difference between the untrained and trained groups. An ANOVA on these data revealed no significant main effect for either frequency (F = 0.99; df = 4; P > 0.05) or group (F = 0.07; df = 1; P > 0.05), but there was a significant frequency × group interaction (F = 2.99; df = 4; P < 0.05). Investigation of this interaction revealed that the mean CF-L20 latency for the trained animals in the frequency band centred at 10 kHz was significantly shorter than that for the untrained group (two-tailed t test; t = 3.56; P < 0.01), but that none of the other differences were significant.
It could be argued that perceptual learning resulting from frequency discrimination training might alter cortical cluster response latency at the discrimination frequency rather than at the cluster’s CF. To investigate this possibility, the first spike latency of the 8.0 kHz response at 20 dB above the 8.0 kHz threshold (8.0 kHz-L20; derived from the 8.0 kHz I/O functions) was compared across the untrained and trained groups. An ANOVA on these data (Fig. 13C) showed no significant main effect for either frequency (F = 0.84; df = 4; P > 0.05) or group (F = 2.92; df = 1; P > 0.05), but a significant frequency × group interaction (F = 2.46; df = 4; P < 0.05). The mean 8.0 kHz-L20 of the trained group is significantly shorter than that of the untrained group for the 8 and 10 kHz bands (two-tailed t test; t = 2.51; P < 0.02 and t = 2.98; P < 0.01, respectively). The differences in the other frequency bins were not significant. These data, together with the CF-L20 data, suggest that the response latency of AI clusters in some frequency bands has become shorter as a result of frequency discrimination training.
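The between-group latency comparisons within a frequency band rest on a two-sample t statistic. A minimal sketch follows, assuming the pooled-variance (Student) form; the text does not state whether a pooled or unequal-variance test was used, and the latency values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample Student t statistic and degrees of freedom.

    Assumes equal population variances (pooled estimate); this is an
    assumption of the sketch, not a statement about the original analysis.
    """
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)
    ) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical first-spike latencies (ms) for one frequency band.
untrained_ms = [14.0, 15.0, 16.0]
trained_ms = [11.0, 12.0, 13.0]
t_stat, dof = pooled_t(untrained_ms, trained_ms)
```

The resulting statistic would then be referred to a two-tailed t distribution with the returned degrees of freedom.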
For five of the six cats trained on an 8 kHz frequency discrimination task, frequency discrimination thresholds improved with training and reached asymptotic levels. A number of lines of evidence indicate that a substantial component of this improvement reflected perceptual learning (i.e. an improvement in sensory discriminative capacity, as reflected in increased d′ scores) rather than procedural learning or criterion shifts. The frequency organization of AI in these trained cats did not differ from that in untrained control cats, regardless of whether area of representation was defined in terms of CF, BF or of the area over which neurons were responsive to 8 kHz at the SPL used in training. This result is in marked contrast to that of Recanzone et al. (1993), who reported 6.7- to 9.3-fold increases in the area of representation of the training frequencies in owl monkeys trained on a similar frequency discrimination task (values derived from comparisons of areas for individual monkeys with mean untrained areas, presented in their fig. 12). Although we found no evidence of changes in cortical topography as a correlate of perceptual learning on a frequency discrimination task, we did obtain evidence suggestive of small changes in neuronal response characteristics. Neurons with a CF within 2 kHz above the discrimination frequency in the trained cats had slightly broader tuning than neurons in the equivalent region in control cats, although these differences did not achieve statistical significance, and neurons in one of these bands had significantly shorter latency at both CF and 8 kHz. The use of quantitative response areas derived from FI matrices to determine neuronal response parameters, and of quantitative and objective contour fitting procedures to derive the frequency maps and areal measures, gives confidence in the validity of these results. 
In the following two sections the behavioural and neurophysiological data, respectively, will be discussed and compared with those from previous studies of auditory perceptual learning. In a final section the results will be compared with those from recent studies of the neural correlates of visual perceptual learning.
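The distinction drawn above between changes in discriminative capacity and criterion shifts can be made concrete with the standard equal-variance signal detection formulas, d′ = z(H) − z(F) and c = −[z(H) + z(F)]/2, where H is the hit rate and F the false alarm rate. The sketch below uses these textbook definitions; the rates are hypothetical, and the exact procedure used to compute d′ from the behavioural data is not restated in this section.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: separation of signal and noise distributions."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate, false_alarm_rate):
    """Response bias c: negative values indicate a lax (liberal) criterion."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical rates: the same hit rate with very different false alarm
# rates corresponds to different sensitivity and different bias.
strict = (d_prime(0.84, 0.16), criterion(0.84, 0.16))
lax = (d_prime(0.84, 0.50), criterion(0.84, 0.50))
```

In this example the hit rate alone is identical in both cases, yet the observer with the higher false alarm rate has a lower d′ and a negative (lax) criterion, which is why hit rates or thresholds by themselves cannot separate perceptual learning from a criterion shift.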
The asymptotic performance levels achieved by the cats in this study are in good agreement with those reported by other investigators using similar appetitive training procedures (Hienz et al., 1993). As can be seen in Figure 14, our ΔF value at 3 kHz (based on the three cats trained originally at that frequency) is almost identical to that reported by Hienz et al. (1993), and our 8 kHz value (based on the five cats that showed perceptual learning at that frequency) lies close to the extension of their curve. The much lower values for cats reported by Elliott et al. (1960) (see Fig. 14) in an early study using shock avoidance conditioning almost certainly reflect the cats’ use of a very lax criterion in an aversive paradigm in which false alarms were apparently not penalised. The ΔF values achieved by our cats at 8 kHz (510–1850 Hz) are much larger than those achieved by Recanzone et al.’s monkeys at that frequency (120–328 Hz). However, just as our values are in accord with those of Hienz et al. (1993), Recanzone et al.’s values are in accord with those of other studies of frequency discrimination in non-human primates (Sinnott et al., 1985, 1987; Prosen et al., 1990). As discussed by Hienz et al. (1993), there is clearly a species difference between cats and monkeys in frequency discrimination ability.
The data from Recanzone et al. (1993) on the generalization of perceptual learning differ from our results and from human psychophysical data bearing on the same issues. They reported that when monkeys were tested at a frequency outside the normal training range in widely spaced individual sessions, there was no improvement in performance at that frequency, i.e. that perceptual learning did not generalize to other frequencies. This was not the case in our study, in which improvements in performance on widely spaced tests at 3 kHz occurred both in cats previously trained at this frequency (Fig. 2E,F) and in cats with no previous training (Fig. 2A–C). Generalization was also reflected in the fact that smaller improvements in performance (Figs 2 and 3) and in d′ scores (Fig. 4) were exhibited at 8 kHz by those cats previously trained at 3 kHz than by those cats with more limited previous training. In a recent study of the specificity of perceptual learning in a frequency discrimination task in humans, Irvine et al. (2000) reported a similar high degree of generalization when participants trained at one frequency (5 or 8 kHz) were tested at the other (8 and 5 kHz, respectively). This generalization occurred despite the fact that all participants had been given considerable practice on the task prior to training, so the improvement at the untrained frequency could not be attributed to task-related learning.
Recanzone et al. (1993) also reported that in one monkey tested with –ΔF in widely spaced sessions during training with +ΔF, performance on the –ΔF discrimination became worse in the course of training, and presented similar but less complete data in another animal tested with –ΔF at the end of training. In contrast, in the three cats we tested with –ΔF at various times during training, –ΔF discrimination improved in parallel with +ΔF discrimination, and –ΔF thresholds were consistently better than those with +ΔF (Fig. 2A–C), as has also been reported for humans (Sinnott et al., 1987). These data are in agreement with the findings in an experiment on human frequency discrimination, in which –ΔF thresholds were obtained before and after extensive training with +ΔF and showed similar improvements (Irvine et al., 2004).
It is unclear whether these different patterns of results with respect to generalization are attributable to species differences or to differences in procedures, although it seems unlikely that generalization to other frequencies and from +ΔF to –ΔF discrimination should occur in cats and humans but not in owl monkeys.
The fact that we found no evidence of the enlarged representation of training frequencies reported by Recanzone et al. (1993) might be a consequence of the use of different species and/or of different stimulus configurations and training paradigms. As noted in the preceding section, one difference between the two species is in the level of frequency discrimination performance of which they are capable. Notwithstanding this difference, the critical fact is that the cats in our study exhibited perceptual learning and achieved asymptotic performance levels, without any change in auditory cortical topography. Another possible difference between cats and monkeys is the extent to which frequency discrimination depends on auditory cortex in the two species. Early evidence concerning the effects of auditory cortical lesions on frequency discrimination was equivocal in both species, the different results reflecting differences in tasks and in the nature of the stimulus sequences employed (for a review see Elliott and Trahiotis, 1972). It should also be emphasised that all of the lesion data are derived from experiments involving large lesions of most or all of auditory cortex, and none of them bear directly on the role of AI per se. The most recent evidence for monkeys (Harrington et al., 2001) indicates that such lesions result in a small increase in frequency discrimination thresholds, and thus that normal frequency discrimination depends at least partly on auditory cortex. In the case of cats, the weight of evidence suggests that auditory cortex is probably not necessary for frequency discrimination (Elliott and Trahiotis, 1972; Heffner and Heffner, 1998), although it is noteworthy that in the only study that involved a discrimination similar to that used in our experiment (namely detection of S2 after a train of S1 stimuli), cats with large lesions of auditory cortex were unable to relearn the task (Meyer and Woolsey, 1952). 
Even if auditory cortex were found not to be necessary for frequency discrimination in cats, if perceptual learning on the task involved an enlarged representation of the training frequencies at lower levels of the lemniscal auditory pathway, this change in topography would be reflected in AI and would have been detected in our experiments. Our data therefore indicate that such learning in cats does not result in such enlarged representations either at AI or at any level of the lemniscal pathway prior to AI.
The possible effects of stimulus factors are indicated not only in the review of the early lesion results by Elliott and Trahiotis (1972) but also by the report of Kilgard et al. (2001) that the form of plasticity in AI neural response properties produced by pairing of stimuli with basal forebrain stimulation is differentially dependent on stimulus parameters. In our study, cats were presented with a train of 2–8 tone pulses and were required to detect a change from S1 (8 kHz) to S2 (8 kHz +ΔF). In the experiment of Recanzone et al. (1993), monkeys were presented with a train of tone-pulse pairs, and were required to detect a change from S1 (both pulses the same frequency; e.g. 8 kHz) to S2 (the first pulse at that frequency and the second at that frequency +ΔF). If the difference in the two studies reflects this difference in stimulus parameters, it would indicate that the neural changes observed by Recanzone et al. (1993) were the consequence of a particular stimulus configuration rather than of improvements in frequency discrimination capacity per se.
Our cortical data differ in two further respects from those reported by Recanzone et al. (1993). As noted above, they reported that in the mapping data for the monkey tested most thoroughly on a –ΔF discrimination there were no locations in AI at which the CF fell in the range of the –ΔF stimuli. In contrast, there was no suggestion that the area of representation of frequencies below 8 kHz in our trained cats was smaller than that in the untrained cats, and the mean areas for the frequency range 6.25–7.75 kHz did not differ between the two groups. Recanzone et al. (1993) also reported that multi-neuron clusters in the area of enlarged representation of the training frequencies had sharper frequency tuning (i.e. higher Q10 values) and longer response latencies relative to those of clusters in the areas in which the same frequency ranges were represented in control animals. In contrast to these results we observed a (non-significant) tendency for broader frequency tuning for neurons in two 1 kHz frequency bands immediately above 8 kHz, and significantly shorter response latencies in at least one of these frequency bands.
Our finding that improved frequency discrimination in the trained animals was associated with a tendency for broader frequency tuning (lower Q20 values) in clusters with CF in the 2 kHz band above the training frequency at first sight appears counter-intuitive. If frequency discrimination depended on the activation of different populations of neurons in AI or a subcortical nucleus providing input to AI (i.e. on a simple place coding mechanism), it would be expected that improvement in discrimination would be associated with sharper tuning (resulting in less overlap of the activated populations). However, it is possible that frequency discrimination depends on differences in the distributed pattern of activity in overlapping populations of neurons (for a discussion of the various aspects of overlapping excitation patterns that might contribute to discrimination, see McKay et al., 1999). If this were the case, it might be that an increase in the breadth of tuning of neurons with CF above the discrimination frequency could contribute to differentiation of the patterns of activity evoked by the test and comparison stimuli. It is well established that frequency discrimination thresholds in humans decrease with increasing SPL (e.g. Wier et al., 1977; Nelson et al., 1983; Wakefield and Nelson, 1985; Freyman and Nelson, 1991). At low frequencies, this might reflect better phase-locking at higher SPLs, but at higher frequencies the improvement would be associated with broadening of the frequency selectivity of most cortical neurons at higher SPLs, and thus greater overlap of excited populations.
Comparison with Studies of Visual Perceptual Learning
Our finding that perceptual learning on an auditory frequency discrimination task was associated with no change in AI topography and subtle or no changes in the response properties of neurons in restricted regions of AI is in accordance with the results of a number of recent studies of changes in V1 associated with perceptual learning on various visual discrimination tasks. Schoups et al. (2001) and Ghose et al. (2002) trained monkeys on orientation discrimination tasks. In neither case was there any change in retinotopy in V1 (or in V2 in the Ghose et al. study) or in the proportion of units tuned to the training orientation (i.e. in the cortical orientation map). Schoups et al. (2001) reported that neurons tuned to the training orientation exhibited lower discharge rates than neurons tuned to other orientations, and that there was a significant change in the slope of the orientation tuning curves at the training orientation of neurons with preferred orientations approximately 20° from the training orientation. In the context of our finding of a tendency towards broader frequency tuning, it is of interest that Schoups (2002) has also described the change in slope of orientation tuning curves as an increase in breadth of tuning. Ghose et al. (2002) reported no effect of training on the RF properties of neurons in either V1 or V2, but found a small but statistically significant decrease in the population response in V1 to the trained orientation at the trained location, which reflected a slight decrease in the number of neurons responding best to the trained orientation. In a third study, Crist et al. (2001) trained monkeys on a three-line bisection task. They found no change in the retinotopic organization of V1 or in the RF and orientation tuning properties of neurons in the region of V1 activated by the training stimuli. 
The only effect of training was a significant change in the extent to which the responses of neurons in this region to stimuli in their RF were modified by contextual stimuli. This effect was observed only for contextual stimuli that were present in the training task, and only when monkeys were performing the bisection task.
In Figure 15, our data are plotted on a modified version of figure 15 of Ghose et al. (2002) to illustrate the relationship between their data on cortical topography and RF changes and those reported by Recanzone et al. (1992, 1993) in their somatosensory and auditory system studies. Like Ghose et al.’s data, our data point lies close to the intersection of the axes corresponding to no change in either topography or RFs, and as noted above, the data points for Schoups et al. (2001) and Crist et al. (2001) would also be clustered around this point. Whatever the reasons for the discrepancy between our results and those of Recanzone et al. (1993), our data are in agreement with these recent studies of visual discrimination learning, in indicating that perceptual learning can be associated with only small changes in the response properties of neurons in primary sensory cortex and with no change in primary cortical topography.
It remains unclear whether the substrate of perceptual learning in our and these visual studies is to be found in the small changes observed in primary sensory cortex, in other so-far unexamined characteristics of primary cortex, or in changes at other loci. Crist et al. (2001) suggested that the difference between their results and those of Recanzone et al. (1992, 1993) might be that their training involved cortical neuronal properties (RF size and orientation tuning) that are emergent properties of cortical neurons, whereas Recanzone et al.’s auditory and somatosensory studies involved properties of the input to the cortex, and might therefore not depend on intrinsic cortical circuits. Our failure to find evidence for changes in frequency selectivity or topography in cats trained on a frequency discrimination task argues against this possible explanation. In any case, although cortical tonotopy reflects that set up in the cochlea, the frequency selectivity of AI neurons is not solely determined by that of the inputs to cortex, but is indeed shaped by local circuitry (e.g. Wang et al., 2002).
This research was supported by a grant from the National Health and Medical Research Council of Australia (Project grant 980847). We are grateful to Richard Hobbs for technical assistance, to Elena Hartley, Lorraine Park and Karen Park for their contribution to animal training, and to Marc Kamke, Brad May and Ramesh Rajan for comments on an earlier version of the manuscript.