Auditory cortical columns have been studied for decades, but intracolumnar processing in auditory cortex is still poorly understood, relative to what is known about such processing in visual cortex and somatosensory cortex. While there are certainly striking similarities in cortical structure across the modalities, investigations of auditory cortex anatomy and synaptic physiology have also found important differences from the columnar organization of other sensory cortices. In vitro and in vivo studies of thalamocortical transformations in the auditory system have begun to reveal the functional significance of these differences, and have defined the earliest stages of auditory cortical processing. However, the question of what transformations are performed within auditory cortical columns remains unresolved. Attempts to find laminar differences in auditory cortex, which could provide the key to understanding columnar transformations, have so far produced contradictory and inconclusive results. Direct analogies to primary visual and somatic sensory cortices would suggest that response properties such as bandwidth, inhibitory sideband structure, preferred modulation rate and modulation phase sensitivity might vary across layers in auditory cortex. While such analogies could prove useful as guidelines for future research, the best hope for understanding auditory columnar transformations may lie instead with a more modality-specific, functional approach.
Columnar structure in primary auditory cortex (AI) was described as early as 1929, when von Economo reported short chains of neurons radial to the cortical surface, now known as cortical microcolumns (Jones, 2002). Physiological evidence for columnar organization followed, as investigators found that neurons along a radial electrode penetration often shared similar sensitivities to sound frequency (Oonishi and Katsuki, 1965; Suga, 1965; Abeles and Goldstein, 1970; Merzenich et al., 1975). The relative influence of the two ears on auditory cortex neurons, and other response properties thought to be related to sound localization, also seem to exhibit consistent radial organization (Brugge and Merzenich, 1973; Imig and Adrián, 1977; Middlebrooks et al., 1980; Clarey et al., 1994), though some work contradicts these findings (Phillips and Irvine, 1983; Reser et al., 2000). Other neuronal response properties that have been reported to be relatively uniform within the cortical depth include intensity threshold (Suga, 1965), intensity tuning (Clarey et al., 1994) and frequency sweep tuning (Mendelson et al., 1993). These response measures, along with others such as response latency, spectral integration bandwidth and complexity of frequency tuning curves, vary systematically across the cortical surface (Schreiner and Mendelson, 1990; Sutter and Schreiner, 1991; Mendelson et al., 1997). The relationships between these several overlapping stimulus feature maps are not yet entirely understood, but the basic functional organization of AI is now becoming clear: preferred frequency changes gradually across the surface of AI to create a fundamental cochleotopic axis, and other response properties form interleaved subregions along the orthogonal, isofrequency axis (Fig. 1). 
Such an organization is reminiscent of the overlaid orientation, ocular dominance and spatial frequency subdomains within the visuotopic map of primary visual cortex (Hübener et al., 1997), and of the local clustering of adaptation-specific subregions for local processing in primary somatosensory cortex (Sretavan and Dykes, 1983; Sur et al., 1984).
Still unresolved, however, is the question of what neuronal response properties might vary systematically within an auditory cortical column — and therefore what stimulus features might be the substrates for columnar computation. Layer-dependent variations in minimum response latencies have been described in AI (Phillips and Irvine, 1981; Mendelson et al., 1997; Sugimoto et al., 1997), but the evidence for depth-dependent variation in any other response property is not clear. In visual cortex and somatosensory cortex, laminar differences such as the layer-dependent distributions of simple and complex cells in cat visual cortex and the unique lamina-specific anatomical structures in rodent barrel cortex have inspired concrete hypotheses about intracolumnar transformations in those systems (Hubel and Wiesel, 1962; Brumberg et al., 1999). The lack of consensus about laminar differences in auditory cortex seems to have constrained efforts to decipher the function of the cortical column in auditory processing, but recent developments in anatomical and physiological studies of auditory cortex have renewed interest in the subject. In this brief review, we focus on findings (primarily from cat and rodent cortex) that are most relevant to understanding columnar transformations in AI, and we discuss the lessons implicit in related studies of visual and somatosensory cortices. Throughout, we use the term ‘columnar’ to refer to a unit of local cortical processing encompassing interactions between cortical layers, a level of organization that is larger than anatomically defined neocortical microcolumns but perhaps smaller than physiologically defined functional modules.
The primary auditory cortex shares with other sensory cortices the basic characteristics of koniocortex: a prominent layer I, dense and well-developed layers II and III, a somewhat granular layer IV with strong thalamic input, a relatively cell-sparse layer V populated by large pyramidal neurons, and a layer VI with smaller cell bodies (Winer, 1992). The dominant connections are also consistent with visual and somatosensory cortices: lemniscal thalamic input ends in the middle layers, major corticocortical connections arise from layers II and III, subcortical projections originate primarily from layer V, and layer VI sends feedback projections to the thalamus (Mitani and Shimokouchi, 1985; Mitani et al., 1985; Huang and Winer, 2000). As in other sensory cortices, there are also corticocortical and non-lemniscal thalamic inputs to supra- and infragranular layers, as well as corticocortical and corticothalamic outputs from layers other than II/III and VI (Fig. 2) (Winer, 1992; Huang and Winer, 2000). Extensive interconnections between and within the cortical layers (Fig. 2) (Matsubara and Phillips, 1988; Ojima et al., 1991; Wallace et al., 1991) support a primary flow of information from middle layers to supragranular and then to infragranular layers. In addition, pyramidal cells in superficial layers extend their axons laterally in a patchy distribution much like the long-range intrinsic connections in visual cortex (Gilbert and Wiesel, 1979; Rockland and Lund, 1983) and somatosensory cortex (DeFelipe et al., 1986; Schwark and Jones, 1989). These horizontal projections in AI are aligned along isofrequency contours, and they link columns of neurons with similar functional properties, such as spectral integration bandwidth (Read et al., 2001).
The anatomical parallels between auditory and other sensory cortices have led to the hypothesis that there are fundamental principles of neocortical structure and connectivity common to all sensory (and other) cortex (Rockel et al., 1980). However, there are several unusual and possibly unique features of auditory cortex anatomy that complicate attempts to define common rules for sensory neocortical organization. Most obviously, the auditory cortex receives binaural input from subcortical nuclei, while in the visual and somatosensory systems, the primary sensory cortex represents the earliest neural station for convergence of inputs from the two visual hemifields or two sides of the body. Furthermore, layer III neurons in primary visual and somatosensory cortices project predominantly to ipsilateral cortex [except in regions corresponding to midline representations (Innocenti, 1980, 1986; Manzoni et al., 1980)], while many primary auditory cortex layer III neurons project across the corpus callosum (Imig and Brugge, 1978; Winer, 1992). These anatomical distinctions between auditory cortex and the visual and somatosensory cortices probably have their origins at the receptor level. Spatial information in vision and somatosensation is inherent in the arrangement of the peripheral receptors and is preserved throughout early sensory processing, while auditory spatial information must be computed from cues extracted from the differential time-frequency representations of acoustic signals received by the ears [for a review, see (Clarey et al., 1992)].
Other features of auditory cortical circuitry also seem to differ substantially from the anatomy of visual and somatosensory cortices. For example, spiny stellate cells, which dominate layer IV of the visual and somatosensory cortices in most species (Jones, 1975; Lund et al., 1979; Simons and Woolsey, 1984), are largely absent from the middle layers of cat primary auditory cortex (Smith and Populin, 2001). In their place, small pyramidal cells in lower layer III and layer IV appear to be the chief thalamorecipient neurons in auditory cortex (Smith and Populin, 2001). The broader than expected laminar distribution of lemniscal thalamic input to auditory cortex supports this hypothesis. In contrast to visual cortex and barrel cortex, in which the primary thalamic input terminates mainly in layer IV (LeVay and Gilbert, 1976; Landry and Deschênes, 1981), the lemniscal thalamic input to auditory cortex extends well into layer III (Winer, 1992; Huang and Winer, 2000). Another unusual feature of the auditory thalamocortical projection arises outside the lemniscal pathway: giant axons ascending from a non-lemniscal part of the auditory thalamus to layer I of auditory cortex appear to be unique to the auditory system, and may carry some of the earliest thalamic signals into auditory cortex (Huang and Winer, 2000).
Like the anatomy, the intrinsic properties and synaptic physiology of auditory cortex resemble those of other primary sensory cortices, with some intriguing differences. In vitro studies of auditory cortex (Metherate and Aramakis, 1999; Hefti and Smith, 2000, 2002) have identified classes of regular-spiking, fast-spiking and intrinsic-bursting cells seen in other cortical areas (McCormick et al., 1985; Connors and Gutnick, 1990). However, such studies have also found that inhibitory response kinetics are much faster in auditory cortex (Hefti and Smith, 2002), and that auditory cortex may have a unique class of neurons that spikes very briefly upon depolarization and then shows strong outward rectification suppressing further spiking (Metherate and Aramakis, 1999). Furthermore, a recent investigation of synaptic transmission found that layer II/III pyramidal neurons in auditory cortex were connected by synapses displaying low release probability and minimal short-term depression, as well as by high-probability depressing synapses (Atzori et al., 2001); only the latter type of synaptic transmission was observed in barrel cortex.
Since in vitro slice experiments are typically conducted in immature animals, these apparent physiological differences between auditory cortex and other sensory cortices might be an artifact of different maturational rates for each modality (Stern et al., 2001; Zhang et al., 2001; Desai et al., 2002). However, it is also possible that the unusual electrophysiological characteristics of auditory cortex neurons reflect unique features of auditory cortical processing. For example, ultra-rapid inhibition and a wide diversity of synaptic transmission characteristics might contribute to specialization of auditory cortex for fast temporal information processing (Buonomano, 2000). The recent development of an auditory thalamocortical slice preparation (Cruikshank et al., 2002) promises new insights into the nature of auditory cortical physiology, and further modality comparisons through parallel experiments on auditory and somatosensory thalamocortical slices (Agmon and Connors, 1991).
Thalamocortical and Intracortical Transformations
Thalamocortical transformations in the auditory system have recently been characterized in some detail through simultaneous in vivo recordings of functionally connected neurons in cat auditory thalamus and cortex (Miller et al., 2001). These experiments have revealed many forms of thalamocortical transformation, distributed between three extremes. In ‘inheritance’, cortical and thalamic excitatory receptive fields are matched in spectrotemporal extent; in ‘constructive convergence’, the thalamic receptive field is a component of a larger cortical receptive field; and in ‘ensemble convergence’, the cortical receptive field represents a subregion of the thalamic receptive field. Similar studies in the visual system find predominantly ‘constructive convergence’, in that the receptive fields of neurons in the visual thalamus usually cover small subregions of the receptive fields of functionally connected visual cortex neurons (Reid and Alonso, 1995; Alonso et al., 2001). In the somatosensory whisker barrel system, on the other hand, ‘ensemble convergence’ may dominate thalamocortical transformations, since the excitatory portions of thalamic receptive fields tend to include more whiskers than their regular-spiking cortical counterparts (Simons and Carvell, 1989) [although thalamocortical transformations involving suspected inhibitory cortical interneurons may exhibit ‘constructive convergence’ (Swadlow and Gusev, 2002)].
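The three extremes described above can be made concrete with a small illustrative sketch. It is not the analysis used by Miller et al. (2001): the representation of each excitatory receptive field as a set of (frequency bin, time bin) pairs, the function names and the 0.9 overlap tolerance are all assumptions introduced here purely for illustration.

```python
def overlap_fractions(thalamic_rf, cortical_rf):
    """Fraction of each excitatory receptive field that lies inside the other.

    Each receptive field is a hypothetical set of (frequency_bin, time_bin) pairs.
    """
    shared = thalamic_rf & cortical_rf
    return len(shared) / len(thalamic_rf), len(shared) / len(cortical_rf)


def nearest_extreme(thalamic_rf, cortical_rf, tol=0.9):
    """Name the extreme that a thalamocortical pair most resembles.

    The tolerance `tol` is an arbitrary illustrative threshold, not a published value.
    """
    frac_thalamic, frac_cortical = overlap_fractions(thalamic_rf, cortical_rf)
    if frac_thalamic >= tol and frac_cortical >= tol:
        return "inheritance"              # fields matched in spectrotemporal extent
    if frac_thalamic >= tol:
        return "constructive convergence"  # thalamic field is part of a larger cortical field
    if frac_cortical >= tol:
        return "ensemble convergence"      # cortical field is a subregion of the thalamic field
    return "intermediate"                  # somewhere between the extremes


# Toy receptive fields: a small 2 x 2 patch and a larger 4 x 2 patch that contains it.
small_rf = {(f, t) for f in range(2) for t in range(2)}
large_rf = {(f, t) for f in range(4) for t in range(2)}

print(nearest_extreme(small_rf, small_rf))
print(nearest_extreme(small_rf, large_rf))
print(nearest_extreme(large_rf, small_rf))
```

Because the reported transformations are distributed between the extremes rather than falling into discrete classes, the two overlap fractions themselves are probably more informative than the categorical label, which is included only to tie the sketch back to the terminology above.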
The various auditory thalamocortical transformations demonstrated by Miller and colleagues (Miller et al., 2001) involved primarily the excitatory portions of thalamic and cortical receptive fields. Inhibitory subregions of paired thalamic and cortical receptive fields appear to be less closely related, and many receptive-field properties that depend on inhibitory subfield arrangements (e.g. temporal and spectral modulation preferences) are poorly conserved in auditory thalamocortical transformations (Miller et al., 2001). Perhaps the inhibitory subregions of cortical receptive fields (and associated neuronal response properties) are generated intracortically, through disynaptic interactions involving thalamic input onto inhibitory interneurons that synapse onto pyramidal cells within the same cortical layer. Such intracortical inhibition may shape cortical responses in thalamorecipient layers of auditory cortex, much as it is thought to do in layer IV of visual cortex (Somers et al., 1995; Hirsch et al., 1998; Troyer et al., 1998) and barrel cortex (Brumberg et al., 1996; Pinto et al., 2000; Swadlow and Gusev, 2000).
How are receptive fields in thalamorecipient layers of auditory cortex transformed by further intracortical columnar processing? As mentioned in the introduction, previous studies of laminar differences and columnar processing in auditory cortex have failed to produce a consensus on how auditory receptive fields might differ across cortical layers. Studies of cat auditory cortex have reported layer-dependent variations in minimum response latency, with the shortest latencies in the thalamorecipient middle layers (Phillips and Irvine, 1981; Mendelson et al., 1997), but investigations of rodent auditory cortex find the shortest response latencies in deeper layers [Mongolian gerbil (Sugimoto et al., 1997); mouse (Shen et al., 1999)]. Laminar differences in frequency tuning bandwidths, intensity thresholds and other response properties have been observed in some studies of cat, bat and rodent auditory cortex (Oonishi and Katsuki, 1965; Eggermont, 1996; Dear et al., 1993; Sugimoto et al., 1997), but not in other studies of the same species (Abeles and Goldstein, 1970; Phillips and Irvine, 1981; Jen et al., 1989; Clarey et al., 1994; Foeller et al., 2001). Meanwhile, investigations in awake monkey cortex have recently reported systematic layer-dependent variations in binaural interactions (Reser et al., 2000), and have suggested that such laminar differences might have been masked in earlier experiments by the confounding effects of anesthesia.
Even if effects of anesthesia explain some of the discrepancies in the literature, the lack of a consensus regarding laminar differences in auditory cortex contrasts markedly with the situation for visual and barrel cortex. Although controversies about the nature of intracolumnar transformations in those systems are far from resolved, the existence of laminar differences in stimulus sensitivities is beyond dispute. Indeed, laminar differences in visual cortex and barrel cortex (of both anesthetized and awake animals) have inspired many hypotheses about columnar function in those modalities. For example, the layer-dependent distribution of simple and complex cells in cat visual cortex, with simple cells predominating in the input layers and complex or hypercomplex cells more prevalent in superficial or deep layers, prompted Hubel and Wiesel to propose that complex cell receptive fields emerge through convergence of simple cell inputs within a column (Hubel and Wiesel, 1962). Their hypothesis has received experimental support from recent studies (Alonso and Martinez, 1998; Martinez and Alonso, 2001), although the influences of nonlinear dendritic interactions and recurrent connections on complex receptive-field structure are still much debated (Mel et al., 1998; Chance et al., 1999). In barrel cortex, the anatomical and physiological differences between layer IV and superficial or deep layers have also inspired hypotheses regarding columnar computation in this system. For example, the superficial and deep layers, which contain neurons with complex multi-whisker receptive fields, may construct dynamic representations of behaviorally relevant stimuli from the more precise single-whisker representations that predominate in layer IV (Simons, 1978; Brumberg et al., 1999). A complementary hypothesis is that the different layers of barrel cortex support parallel processing of spatial and temporal tactile information (Ahissar et al., 2000, 2001).
Is auditory cortex inherently more homogeneous across cortical layers than these other sensory cortices? As discussed previously, auditory cortical circuitry does have some unique features, but the fundamental similarities with other sensory cortices seem far more striking than these differences. Studies in which auditory thalamocortical pathways are modified experimentally (‘rewired’) to receive visual signals further suggest that auditory cortex is capable of supporting the thalamocortical and intracolumnar transformations that produce laminar differences in other modalities. When retinal inputs are routed into the auditory thalamus after deafferentation of the normal brainstem inputs to the structure (Sur et al., 1988; Angelucci et al., 1998), auditory cortical cells develop visual response properties such as direction selectivity, orientation tuning and simple/complex receptive-field structure (Roe et al., 1992). Retinotopic maps of orientation tuning, complete with lateral connectivity between orientation domains, emerge in superficial layers of the rewired auditory cortex (Roe et al., 1990; Sharma et al., 2000). While laminar differences in rewired AI have not yet been systematically explored, the observed physiological parallels with primary visual cortex suggest similar underlying intracolumnar transformations, and provide compelling evidence for common principles of columnar organization linking sensory cortical structures in different modalities.
If laminar organization in auditory cortex is not inherently more homogeneous than that in other sensory cortices, then what is the explanation for the lack of consensus regarding laminar differences? It is possible that auditory stimuli, recording methods or experimental conditions in previous studies have not usually been sufficient or consistent enough to reveal compelling laminar differences, and that further progress awaits the use of new and more sophisticated stimulus sets, novel experimental techniques for recording simultaneously within a column, or simply more data from more species under awake as well as anesthetized conditions. Another possibility, however, is that cortical processing of auditory information drives functional organization in cortex that obscures laminar differences. Most previous studies of laminar differences in auditory cortex have examined sequential recordings from single-electrode penetrations, but then pooled data taken from penetrations at different sites across cortex in attempting to identify systematic laminar differences. Perhaps the substantial variability in response properties within thalamorecipient layers of auditory cortex (Fig. 3) greatly exceeds the variability in response properties across cortical layers. Definition of systematic laminar differences in auditory cortex may be possible only within small subregions of the overlapping stimulus feature maps (Fig. 1) in which variability within each cortical layer is minimized.
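The pooling argument can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a model of real data: the response property is in arbitrary units, and the number of sites, the site-to-site spread, the size of the laminar shift and the measurement noise are all hypothetical values chosen only to make within-layer variability across the cortical surface much larger than the difference between layers.

```python
import math
import random
import statistics


def simulate_penetrations(n_sites=40, site_sd=1.0, layer_shift=0.2,
                          noise_sd=0.05, seed=0):
    """Return one response value per site for middle and superficial layers.

    All parameter values are hypothetical and in arbitrary units.
    """
    rng = random.Random(seed)
    middle, superficial = [], []
    for _ in range(n_sites):
        # Variation across the cortical surface, shared by both layers at a site.
        site_offset = rng.gauss(0.0, site_sd)
        middle.append(site_offset + rng.gauss(0.0, noise_sd))
        superficial.append(site_offset + layer_shift + rng.gauss(0.0, noise_sd))
    return middle, superficial


middle, superficial = simulate_penetrations()

# Pooled analysis: layers compared as two independent groups across all sites.
pooled_diff = statistics.mean(superficial) - statistics.mean(middle)
pooled_sd = math.sqrt(
    (statistics.variance(middle) + statistics.variance(superficial)) / 2
)
pooled_effect = pooled_diff / pooled_sd

# Within-penetration analysis: layers compared site by site.
paired_diffs = [s - m for s, m in zip(superficial, middle)]
paired_effect = statistics.mean(paired_diffs) / statistics.stdev(paired_diffs)

print(f"pooled standardized difference: {pooled_effect:.2f}")
print(f"within-penetration standardized difference: {paired_effect:.2f}")
```

In this sketch the mean laminar difference is identical in both analyses; only the variability against which it is judged changes. When site-to-site spread dominates, the pooled comparison buries a shift that is obvious within each penetration, which is the sense in which restricting analysis to small subregions of the feature maps might reveal laminar differences.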
How might receptive-field properties in auditory cortex relate to those observed in visual cortex and somatosensory cortex? If valid analogies can be drawn at the level of the sensory receptors, then the one-dimensional frequency map in the cochlea would be the analog of two-dimensional spatial maps in the retina or on the body surface. Preferred frequency and spectral integration bandwidth in the auditory system would correspond to preferred stimulus location and receptive-field size in the visual and somatosensory systems. Sensitivity to sound intensity would be analogous to sensitivity to brightness or contrast in the visual system, and to the amplitude of skin indentation or whisker deflection in the somatosensory system. Inhibitory sidebands in frequency tuning curves (or equivalently, inhibitory subregions offset spectrally from the excitatory subregions of spectrotemporal receptive fields) might be functionally similar to the inhibitory spatial surrounds of visual or somatosensory receptive fields. Amplitude modulation rate or repetition rate sensitivity in the auditory system would correspond to flicker sensitivity in the visual system, and to tap rate or whisker vibration sensitivity in the somatosensory system. Finally, frequency-sweep tuning in auditory receptive fields might be analogous to visual or somatosensory motion sensitivity, since these forms of response tuning all relate to movement of an auditory, visual or somatosensory stimulus across the receptor surface.
To the extent that such direct receptor-level analogies to the visual and somatosensory systems are appropriate, then one would expect several stimulus features to vary in their representation across layers in auditory cortex. Visual and somatosensory receptive-field sizes tend to be smallest in layer IV (Gilbert, 1977; Simons, 1978; Sur et al., 1985; Métin et al., 1988), suggesting that the spectral integration bandwidths of auditory cortical neurons should be narrowest in the middle layers. Neurons sensitive to high speeds of visual motion are found most often in supra- and infragranular layers of visual cortex (Gilbert, 1977; Mangini and Pearlman, 1980), so perhaps a similar laminar distribution would be expected for auditory cortex neurons sensitive to fast frequency sweeps. Laminar differences in temporal frequency tuning in barrel cortex (Ahissar et al., 2001) imply that receptive fields in different layers of auditory cortex could have distinct preferences for rates of amplitude modulation. Variations with cortical depth in the strength of orientation tuning in visual cortex (Mangini and Pearlman, 1980; Martinez et al., 2002; Ringach et al., 2002) and somatosensory cortex (Simons, 1978; Chapin, 1986; DiCarlo and Johnson, 2002) might have an auditory parallel in the laminar distribution of auditory receptive fields with pronounced and/or asymmetric inhibitory sideband structure. Finally, the layer-dependent distribution of simple and complex cells in visual cortex (Hubel and Wiesel, 1962; Gilbert, 1977) suggests that a similar distribution of linear and nonlinear response types might be found in auditory cortex. Simple and complex visual neurons are distinguished by their relative sensitivity to the spatial phase of an oriented stimulus; auditory cortex analogs might show varying sensitivity to the phase of spectral ripple, or perhaps to the phase of frequency and amplitude modulations.
Ultimately, however, these receptor-level analogies may prove less useful as a guide to understanding columnar transformations in auditory cortex than a more functional, modality-specific approach inspired by another observation from studies of visual and somatosensory cortices: the apparent relevance of columnar structure to experience-dependent sensory processing. Experience-dependent plasticity in receptive-field structure follows layer-dependent time courses in both visual cortex (Daw et al., 1992; Trachtenberg et al., 2000; Desai et al., 2002) and barrel cortex (Diamond et al., 1994; Stern et al., 2001). These findings suggest that experience may shape and underlie the function of cortical columns in any cortical structure. Indeed, experiments in congenitally deafened cats have already demonstrated that auditory experience plays a crucial role in development of normal patterns of laminar activation in auditory cortex (Kral et al., 2000). Other studies of experience-dependent plasticity in auditory cortex (Recanzone et al., 1993; Weinberger, 1998; Kilgard et al., 2001; Zhang et al., 2001) have documented profound changes in frequency tuning, repetition-rate tuning and other auditory receptive-field properties after behavioral training or other manipulations of auditory experience, but such studies have largely omitted any systematic exploration of the effects on a laminar basis. Critical insights into the function of cortical columns in auditory cortex may come, not from strict receptor-level analogies to the roles of cortical columns in visual and somatosensory processing, but from a natural extension of the rich history of work on cortical plasticity to pinpoint the roles of different cortical layers in auditory learning.