While functional connectivity in the human cortex has been increasingly studied, its relationship to the cortical representation of sensory features has been documented far less. We used functional magnetic resonance imaging to demonstrate that voxel-by-voxel intrinsic functional connectivity (FC) is selective to the frequency preference of voxels in the human auditory cortex. Specifically, FC was significantly higher for voxels with similar frequency tuning than for voxels with dissimilar tuning functions. Frequency-selective FC, measured via the correlation of residual hemodynamic activity, was not explained by generic FC that is dependent on spatial distance over the cortex. This pattern remained even when FC was computed using residual activity taken from resting epochs. Further analysis showed that voxels in the core fields in the right hemisphere have a higher frequency selectivity in within-area FC than their counterparts in the left hemisphere or voxels in the noncore fields in the same hemisphere. Frequency-selective FC is consistent with previous findings of topographically organized FC in the human visual and motor cortices. The high degree of frequency selectivity in the right core area is in line with findings and theoretical proposals regarding the asymmetry of human auditory cortex for spectral processing.
To understand complex computation in the brain, it is necessary to identify the pattern of functional interactions between different regions at various spatial and temporal scales. This requires an understanding of the pattern of temporal coherence of neural activity between individual neurons or different populations of neurons, beyond relating only the magnitude of neural responses to behavioral and cognitive variables. Temporal coherence in neural activity has been studied for decades in terms of synchrony in spiking activity (Phillips et al. 1984), synchronous oscillations of neural populations (Singer and Gray 1995; Fries 2005), and correlations in trial-to-trial variability (or “noise” correlation) (Gawne and Richmond 1993; Lee et al. 1998; Averbeck et al. 2006). Given the invasive nature of this line of research, such fine-scale temporal dynamics in human brains have not been studied much. However, temporal coherence in activity measured with functional magnetic resonance imaging (fMRI), referred to as “functional connectivity” (FC), has gained much attention and has provided significant information in the field of systems and cognitive neuroscience (Fox and Raichle 2007; Behrens and Sporns 2012). Although the fluctuations that yield coherent patterns in fMRI are very slow (<0.1 Hz) compared with the fast fluctuations examined in neurophysiological studies, they seem to have a neuronal origin (Shmuel and Leopold 2008; Schölvinck et al. 2010) as do evoked fMRI responses (Logothetis et al. 2001). It is also evident that coherent fast fluctuations or oscillations in neural activity observed in neurophysiological studies are embedded in very slow fluctuations in fMRI activity (Leopold et al. 2003; Shmuel and Leopold 2008; Kohn et al. 2009; Leopold and Maier 2012).
Notably, FC, or temporal coherence in neural activity, in most of these studies is considered “intrinsic” because it is obtained by correlating spontaneous activity in the absence of a stimulus or a task, and thus it is not explained by external inputs or task demands.
While the functional roles of intrinsic neural activity are not yet well understood, there has been a line of research to relate it to cortical representation of sensory stimuli, both in animals and humans. For instance, a study using voltage-sensitive dye imaging in the cat visual cortex showed that temporally coherent spontaneous activity emerges in the spatial pattern of orientation maps (Tsodyks et al. 1999; Kenet et al. 2003). Nauhaus et al. (2009) found that spontaneous spiking activity in monkey and cat primary visual cortex triggers spatiotemporal propagations of local field potentials that reflect the similarity of preferred orientation between recording sites. In the auditory domain, Fukushima et al. (2012) demonstrated that high-gamma band spontaneous activity in the macaque auditory cortex is coherent with frequency tuning at recorded sites. Analogously, correlations of residual spiking activity (noise correlation) of 2 simultaneously recorded neurons in macaque primary motor cortex have been reported to be high when their tuning properties are similar (Lee et al. 1998). In fMRI studies, it has been demonstrated that topographically organized sensory and motor features are related via very slow coherent fluctuations of intrinsic activity. Heinzle et al. (2011) found that intrinsic FC measured by fMRI in visual cortex is retinotopically organized: activity of fMRI voxels in V1 in resting-state was better explained by activity of voxels in V3 when the voxels had similar receptive field locations than dissimilar locations. Other resting-state fMRI studies have shown somatotopic organization of intrinsic FC in human motor network (van den Heuvel and Hulshoff Pol 2010; Cauda et al. 2011) and in monkey somatosensory cortex (Chen et al. 2011). 
These results agree with observations in the animal literature that neural populations with similar tuning share coherent intrinsic activity, although the gaps in sampling units, frequency ranges, and species have yet to be filled (Kohn et al. 2009).
An advantage of using fMRI to study the functional organization of the brain at large scale is that it samples activity in multiple brain regions simultaneously. This applies not only to conventional amplitude-based studies, but also to studies of FC. For example, in Heinzle et al. (2011), the authors additionally demonstrated that the pattern of intrinsic FC in the visual cortex reflects difference in interhemispheric connectivity depending on voxel receptive field locations: voxels whose receptive fields were located along the vertical meridian have significant FC across the hemispheres, while that is not the case for voxels that are responsive to the horizontal meridian. This pattern of FC reflects the pattern of callosal connectivity found in anatomical tracer studies (Kennedy et al. 1986). Haak et al. (2012) have shown that the spatial extent of FC in human visual cortex increases as the hierarchical order of visual areas increases. This pattern is commensurate with the hierarchical structure of retinotopic connectivity in early visual areas (Lehky and Sejnowski 1988; Angelucci et al. 2002). Their findings also support the notion of constant cortical extent of interareal projections in the early visual cortex that has been proposed in previous studies (Hubel and Wiesel 1974; Motter 2009; Kumano and Uka 2010; Harvey and Dumoulin 2011).
In the present study, we hypothesized that intrinsic activity of auditory neurons in humans is more correlated with each other when they have similar preferred frequencies than when dissimilar. This hypothesis can be tested using fMRI since fMRI is capable of sampling the activity of neurons with similar preferred frequency in a voxel, because of tonotopic organization (Merzenich and Brugge 1973; Romani et al. 1982; Morel et al. 1993; Howard et al. 1996; Wessinger et al. 2001); intrinsic FC can then be estimated on a voxel-by-voxel basis as discussed above. We predicted that the intrinsic FC between fMRI voxels, which is not explained by stimulus inputs, would be higher between voxels with similar preferred frequencies than between voxels with dissimilar ones. To test this hypothesis, we first estimated preferred frequencies of individual voxels in human auditory cortex using fMRI, and then analyzed residual fMRI activity and resting-epoch activity to compute intrinsic FC with respect to preferred frequency.
We further investigated whether the degree of frequency selectivity in FC differs across core and noncore auditory cortex, and across the hemispheres, in order to relate the pattern of FC to 2 well-known principles of functional architecture of auditory cortex: hierarchical processing and functional asymmetry. In human auditory cortex, core fields have higher frequency selectivity than secondary fields (Wessinger et al. 2001; Moerel et al. 2012), as is also the case in other animals (Morel et al. 1993; Rauschecker et al. 1995). This hierarchy is thought to be due to integration of a broader range of frequency information and higher complexity of sensory representation in the secondary fields than the core (Rauschecker 1998; Kaas et al. 1999; Wessinger et al. 2001; Kumar et al. 2007). Another important aspect in the functional organization of the human auditory cortex is that the auditory cortex in the right hemisphere has a higher spectral resolution than on the left, which instead is more sensitive to rapid temporal variations (Zatorre et al. 2002). This model is supported by findings that right auditory areas show stronger modulations to spectral variations (Schönwiesner et al. 2005; Hyde et al. 2008).
One interesting question that we address with our paradigm is whether the extent of frequency selectivity in FC is reflective of the hierarchical and asymmetric patterns of response seen in other studies. If frequency selective FC is a byproduct of the selectivity in response amplitude to stimuli, or conversely, sharp frequency tuning in the core fields is explained by local FC within an area, then we would expect that FC would be more frequency-selective (1) in the core fields than in the noncore fields, and (2) in the right than in the left auditory cortices. However, if FC is not simply reflective of the pattern of response amplitude but rather provides additional information for hierarchy and asymmetry, the pattern of FC would differ from the pattern of frequency selectivity in response amplitude. We compare the degrees of frequency selectivity in FC across the core and noncore fields of both hemispheres and discuss the results in light of sensory encoding/decoding and hierarchical emergence of functional asymmetry.
Materials and Methods
Seven people (4 male, 25–32 years old) with normal hearing went through anatomical and functional MRI scans with informed consent after approval of the experimental procedure by the local ethics committee.
Pure tones at 8 different frequencies were used to stimulate the pure-tone-sensitive and tonotopic auditory areas of the participants. The frequencies were logarithmically spaced between 200 and 8000 Hz (200, 338.8, 573.8, 971.9, 1646.2, 2788.4, 4723.1, and 8000 Hz). In order to minimize adaptation, frequency was slightly jittered within a range of a single semitone (1/12 octave) every 250 ms with a 3/4 duty cycle during each 4-s long stimulus presentation of one of the 8 frequencies. The System 3 hardware of Tucker Davis Technologies (Alachua, FL, USA) was used to generate the stimulus at a 24.4 kHz sampling rate. During stimulus presentation, we added a noise with equal energy in each equivalent rectangular band (Moore and Glasberg 1996) at a level of 40 dB below the tones in order to minimize the effect of different thresholds for different frequencies, interparticipant differences in hearing threshold, and the transfer function of the headphones. Sounds were delivered binaurally via MR-compatible high-fidelity headphones (MR Confon), and subjects were able to adjust the loudness within a range of 70–80 dB SPL.
A sparse imaging protocol (Belin et al. 1999; Hall et al. 1999) with 9-s long repetition time (TR) was applied (Fig. 1). A block consisted of 4 epochs (corresponding to 4 TR's), each of which lasted for 9 s. During the initial 2 epochs of a block, a pure tone in one of the 8 frequency conditions was presented. Each epoch started with 4 s of stimulus presentation, followed by 1 s of image acquisition, and then 4 s of silence. Thus, the noise due to the functional image acquisition did not interfere with hearing the tone stimulus. The 2 epochs with pure tone sound presentation were followed by silence that lasted for the remaining 2 epochs. A long duration (18 s) of silence was inserted in order to minimize the fMRI response undershoot effect between stimulus blocks (Hu et al. 2010; Olulade et al. 2011). Each frequency condition was presented 10 times for each run of functional imaging and each subject underwent 2 runs. The order of stimuli was pseudorandomized with balanced transition probability. The subjects were instructed to passively listen to the stimuli while watching a silent nature documentary.
An echo-planar imaging sequence (gradient echo; repetition time: 9 s; echo time: 36 ms; flip angle: 90°; in-plane resolution: 1.5 × 1.5 mm2; slice thickness: 2.5 mm; field of view: 192 mm) was used to acquire functional images on a 3 T scanner (Trio, Siemens). The total number of volumes per subject was 322 including 1 initial dummy volume. Thirteen slices were oriented parallel to the lateral sulcus to cover Heschl's gyrus, planum temporale, planum polare, and the superior temporal gyrus and sulcus. A high-resolution (1 × 1 × 1 mm3) MPRAGE image that covered the whole brain was acquired for each subject in the same session for anatomical registration.
Data Preprocessing and the Estimation of Preferred Frequency of Voxels
Functional imaging data were preprocessed and the general linear model (GLM)-based estimation of response to the tone stimuli was conducted using SPM2 (www.fil.ion.ucl.ac.uk/spm, last accessed August 26, 2014). Preprocessing included motion correction and high-pass filtering. To minimize spatial autocorrelation, no additional spatial smoothing or stereotaxic normalization was applied. The gray matter (GM), the white matter (WM) and the cerebrospinal fluid were segmented using the segmentation tool of SPM2 (Ashburner and Friston 1997). The segmented tissues were used to define the voxels/regions of interest and estimate correlations as a function of intervoxel distance (see below). High-resolution anatomical images were aligned with the functional images to be displayed in Figure 2.
Eight frequency conditions were taken into account in the GLM analysis. A boxcar model was used to account for hemodynamic responses in a long-TR sparse acquisition. After estimating response amplitude as regression coefficients, an F-contrast was applied to detect pure-tone-responsive voxels (F-test, P < 0.05, uncorrected). A rounded exponential function, which is a Gaussian-like bell-shaped function, was fit as a frequency tuning function (Rosen and Baker 1994) to the response amplitudes of each voxel to estimate their preferred frequency. The form of the fitting function was $y = a(1 + k|x - m|)e^{-k|x - m|}$, where y is the response amplitude for 8 frequencies, x is the vector of log-transformed frequencies and the preferred frequency was parameterized by m. Accordingly, the preferred frequency of each voxel was selected at the peak of the function. Note that the 8 linear-scale frequencies (200, 338.8, 573.8, 971.9, 1646.2, 2788.4, 4723.1, and 8000 Hz) corresponded to integers from 1 to 8 in the log scale and the preferred frequency estimate was bounded at 0.5 and 8.5 in the log scale (correspondingly at 153.7 and 10412 Hz in the linear scale of frequency). The preferred frequency of the voxels that had a nonsignificant level of goodness-of-fit (F-test, P > 0.05; 17.12% [SD = 0.01] of voxels) was replaced with the center of mass of the amplitudes, that is, the average of the log-transformed frequency values weighted by response amplitude.
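The mapping between the log-scale index and frequency in Hz, the tuning function, and the center-of-mass fallback can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function names are hypothetical, the tuning curve assumes the standard rounded exponential form with a decaying exponent, and the center of mass assumes non-negative amplitudes with a nonzero sum.

```python
import math

F_MIN, F_MAX, N_TONES = 200.0, 8000.0, 8  # tone range and count from the text


def index_to_hz(x):
    """Map a log-scale index (1..8 for the 8 tones) to frequency in Hz."""
    return F_MIN * (F_MAX / F_MIN) ** ((x - 1) / (N_TONES - 1))


def roex(x, a, k, m):
    """Rounded exponential tuning curve (assumed form); peaks at x = m with value a."""
    d = abs(x - m)
    return a * (1 + k * d) * math.exp(-k * d)


def center_of_mass(amplitudes):
    """Amplitude-weighted mean of the log-scale indices 1..N.

    Assumes non-negative amplitudes whose sum is not zero.
    """
    indices = range(1, len(amplitudes) + 1)
    return sum(i * a for i, a in zip(indices, amplitudes)) / sum(amplitudes)


# The bounds 0.5 and 8.5 on the log scale map to about 153.7 Hz and 10412 Hz.
print(round(index_to_hz(0.5), 1), round(index_to_hz(8.5)))
```

In practice the parameters a, k, and m would be obtained with a nonlinear least-squares routine, which is omitted here.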
Selection of Voxels and the Definition of the Core-fields and the Noncore-fields Areas
Upon choosing the pure-tone-responsive voxels and estimating their preferred frequency, we defined the core-fields and the noncore-fields areas. The border of the core-fields area was identified based on tonotopic gradient and multivariate pattern classification under the assumption that the core fields are more sensitive to pure-tone stimuli than the noncore, as determined in a previous study (see Schönwiesner et al. 2015 for further details). This method utilizes the support vector machine technique to find the boundaries in the imaging data to classify the frequencies. The 8-class frequency classification problem was solved by partitioning into pair-wise binary classifications. The core-fields area was defined as those voxels in which the classifier predicts the frequencies with statistically significant accuracy. The classification accuracy was significantly correlated with the response magnitude to pure tones and frequency tuning width, which can serve as indicators of response properties of the core-fields area according to previous studies (Morel et al. 1993; Rauschecker et al. 1995; Wessinger et al. 2001; Petkov et al. 2006; Moerel et al. 2012). The delineation of the core area by this method was generally in agreement with the results of the probabilistic cytoarchitectonic maps (Morosan et al. 2001; Rademacher et al. 2001). Also, this same method identified core fields in macaque monkeys that overlap with previous parcellations of the auditory cortex (AC) in the same data (Petkov et al. 2006). Only the voxels that had P-values <0.05 and resided in the segmented gray matter were included in the present study.
The voxels of noncore areas were identified by searching for significantly activated voxels in the gray matter starting from the edge of the core areas: the search algorithm incorporated the surrounding voxels in the 3-dimensional space until it could not find any more significantly activated voxels within the gray matter. This procedure was chosen under the assumption that the belt and the parabelt areas surround the core fields, based on the known AC organization in primates (Kaas et al. 1999; Kaas and Hackett 2000; Hackett 2011). Figure 2A shows a representative subject's voxels of interest marked as core and noncore areas in 3 selected slices.
Residual Functional Connectivity and its Frequency Selectivity in the Auditory Cortex
Voxel-by-voxel FC was computed as temporal correlation (Pearson's correlation coefficient) of the residual fMRI signal of each voxel after regressing out the model responses predicted by the stimulus (hereafter “residual” FC). For each pair of voxels, correlation coefficients obtained from 2 runs were averaged. To ensure linearity, correlation coefficients were transformed to Fisher's z-score, and then averaged and transformed back to correlation coefficients.
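Averaging correlation coefficients through Fisher's z-transform, as described above, can be sketched as follows. The function names are hypothetical and the example is a minimal illustration for one voxel pair, not the study's implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_correlations(r_values):
    """Average correlations in Fisher z-space and transform back to r.

    Assumes every |r| < 1, since atanh diverges at +/-1.
    """
    z = [math.atanh(r) for r in r_values]
    return math.tanh(sum(z) / len(z))


# Residual FC of one voxel pair, averaged over the 2 runs (made-up values).
fc = average_correlations([0.2, 0.8])
```

Because the z-transform is nonlinear, the result differs from the arithmetic mean of the two coefficients (here it is larger than 0.5).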
FC was analyzed by pairing voxels at 2 levels: a hemisphere level and an area level. At the hemisphere level, we paired voxels either from one hemisphere (L–L: within the left hemisphere; R–R: within the right hemisphere) or across the hemispheres (L–R: between the left and the right hemispheres, see Figs 3–7). At the area level, voxels were paired within an area or between areas, either within or between hemispheres. In Figure 8, “LC” represents the core area in the left hemisphere; “RC”, the right core; “LN”, the left noncore; and “RN”, the right noncore area. The label of a pair of the same area such as “LC–LC” refers to the case where the seed and the target voxels were drawn from the same area. In this way, 6 pairs of areas were defined within hemispheres (Fig. 8A) and 4 pairs of areas were defined between hemispheres (Fig. 8B).
FC as a function of preferred frequency in Figures 3–5 was obtained by a grid method with respect to preferred frequency of seed and target voxels: FC between a given voxel (seed voxel) and the remaining voxels (target voxels) was sorted by the preferred frequency of target voxels and then the resulting voxel-wise FC data of each seed voxel as function of preferred frequency of target voxels was grouped with respect to preferred frequency of seed voxels. The seed and target voxels were chosen either from the same hemisphere or from different hemispheres, depending on the analysis condition. Note that preferred frequency is expressed on a log-scale that ranged from 0.5 to 8.5 (from 153.7 to 10412 Hz; see the above description in Materials and Methods).
In order to examine patterns of FC related to selectivity but irrespective of the preferred frequency of any given voxel, we merged frequency selectivity of FC across multiple seed voxels regardless of their preferred frequencies, by sorting FC data as a function of difference of preferred frequencies on a log-scale (Δ preferred frequency) (Fig. 2C and D). The data were binned with bin edges of 0, 0.5, 1, 2, 4, and 8, in order to ensure reliable estimations of average FC even at larger Δ preferred frequencies where we have fewer data points. The binned data were averaged across subjects (Figs 3C, 5B, 6C, 7, and 8). Page's trend test was applied to test whether frequency selectivity is statistically significant. L-scores were transformed to χ2 scores to compute P-values (Page 1963).
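The binning and the trend test can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, bins are assumed to be closed on the left edge, and the χ2 approximation is the one given by Page (1963) with 1 degree of freedom. The columns passed to `page_trend` must be ordered so that values are predicted to increase, so FC (predicted to decrease with Δ preferred frequency) is passed with its bins reversed.

```python
import math
from bisect import bisect_right

BIN_EDGES = [0, 0.5, 1, 2, 4, 8]  # edges on the log-scale index, from the text


def bin_index(delta):
    """Index (0..4) of the bin containing delta; the top edge falls in the last bin."""
    return min(bisect_right(BIN_EDGES, delta) - 1, len(BIN_EDGES) - 2)


def ranks(values):
    """Ascending ranks (1 = smallest), with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def page_trend(rows):
    """Page's L and its chi-square approximation (1 df).

    rows: one list per subject, columns ordered by predicted increasing value.
    Returns (L, chi2, p), where p is the upper-tail chi-square probability.
    """
    m, n = len(rows), len(rows[0])
    col_rank_sums = [0.0] * n
    for row in rows:
        for j, r in enumerate(ranks(row)):
            col_rank_sums[j] += r
    L = sum((j + 1) * s for j, s in enumerate(col_rank_sums))
    chi2 = (12 * L - 3 * m * n * (n + 1) ** 2) ** 2 / (m * n**2 * (n**2 - 1) * (n + 1))
    p = math.erfc(math.sqrt(chi2 / 2))
    return L, chi2, p


# 7 subjects whose FC falls monotonically over the 5 bins (made-up values).
fc_by_bin = [[0.5, 0.4, 0.3, 0.2, 0.1]] * 7
L, chi2, p = page_trend([row[::-1] for row in fc_by_bin])
```

With 7 subjects and 5 bins, a perfectly monotonic decrease in every subject gives the maximum possible L of 385, which corresponds to χ2 = 28.00.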
Correction of Intervoxel Distance Bias
Possible biases due to the point-spread of fMRI signals (Engel et al. 1997) and the generic FC that extends over large distance in the cortex (Leopold et al. 2003; Bellec et al. 2006; Honey et al. 2009; Schölvinck et al. 2010) were corrected by subtracting FC due to intervoxel distance in the following way: we first correlated the residual fMRI signals between the voxels in gray matter regions, excluding auditory cortex. We then selected voxels with the lowest 5% F-values in the GLM analysis with the aim to exclude any remaining pure-tone-responsive voxels outside auditory cortex. We then computed voxel-by-voxel correlations between the time courses of these voxels and binned the results along intervoxel distance with width of 1.5 mm (the in-plane resolution of the functional data). We subtracted this FC estimate from the measured FC in auditory cortex to obtain distance-corrected FC (Fig. 6).
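The distance correction can be sketched as follows. The function names and data layout are hypothetical, and treating distance bins with no reference data as zero generic FC is an assumption of this sketch, not a detail stated in the text.

```python
from collections import defaultdict

BIN_WIDTH_MM = 1.5  # in-plane resolution of the functional data


def distance_profile(reference_pairs):
    """Mean correlation per distance bin from (distance_mm, r) reference pairs.

    The reference pairs stand for the nonauditory gray matter voxel pairs.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for dist, r in reference_pairs:
        b = int(dist // BIN_WIDTH_MM)
        sums[b] += r
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def correct_fc(fc, dist, profile):
    """Subtract the generic FC expected at this intervoxel distance.

    Bins without reference data are treated as zero generic FC (assumption).
    """
    return fc - profile.get(int(dist // BIN_WIDTH_MM), 0.0)


# Made-up reference pairs and one auditory voxel pair 1.0 mm apart.
profile = distance_profile([(1.0, 0.4), (1.2, 0.2), (2.0, 0.1)])
corrected = correct_fc(0.5, 1.0, profile)
```

The corrected values would then be sorted and binned by Δ preferred frequency in the same way as the uncorrected FC.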
Resting-Epoch FC and Testing Confounding of Stimulus Effect
As an alternative measure of intrinsic FC that does not depend on the incoming stimulus, FC was also computed by correlating residual activity that was taken from the fourth TR of every block (every 36 s), which is referred to as “resting epochs” (Fair et al. 2007). The acquisition time of fourth TR is relatively far (18 s) from the preceding stimulation, and we would expect only minimal effects of the stimulation on the fMRI signal, because hemodynamic response functions are typically close to zero at this point (Hu et al. 2010; Olulade et al. 2011). To rule out any possibility that delayed hemodynamic responses to the stimulus still affected the signal in resting epochs, we additionally regressed out the stimulus effect from the resting-epoch time series. For this purpose, we applied a GLM that predicts stimulus-induced responses according to the frequency of the sounds given in each block, and subtracted the predicted time course from the resting-epoch time course. The frequency selectivity of resting-epoch FC with and without the stimulus-effect regressed out was tested in the same way as the residual FC (Fig. 7).
Comparison of Frequency Selectivity Between Areas
In order to compare frequency selectivity of FC from one area to another, frequency selectivity of FC was first quantified as the negative slope of a linear function fit to FC as a function of Δ preferred frequency. A higher value thus reflects a steeper decline of FC with increasing Δ preferred frequency, indicating higher selectivity. We tested whether frequency selectivity of FC in one area is higher than in another by a permutation test: we resampled the frequency selectivity values (the negative slopes of the linear fits) of the 7 subjects with replacement 1000 times for each area and permuted the area membership to obtain the sampling distribution under the null hypothesis. A P-value for the difference in mean frequency selectivity was then computed.
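The selectivity index and a label-permutation comparison can be sketched as follows. Note that the test shown is a simplified paired variant that enumerates all within-subject swaps of area labels; it is a stand-in for illustration and is not the resampling scheme described above. Function names and the example values are hypothetical.

```python
from itertools import product


def selectivity(delta, fc):
    """Negative least-squares slope of FC against delta preferred frequency."""
    n = len(delta)
    mx, my = sum(delta) / n, sum(fc) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(delta, fc))
    sxx = sum((x - mx) ** 2 for x in delta)
    return -sxy / sxx


def paired_permutation_p(area_a, area_b):
    """One-sided P-value for mean(area_a) > mean(area_b).

    Enumerates every within-subject swap of the two area labels
    (2**n assignments for n subjects) and counts those whose summed
    difference is at least as large as the observed one.
    """
    diffs = [a - b for a, b in zip(area_a, area_b)]
    observed = sum(diffs)
    hits = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Per-subject selectivity in two areas (made-up values, 7 subjects).
right_core = [0.09, 0.08, 0.10, 0.07, 0.11, 0.09, 0.08]
left_core = [0.05, 0.06, 0.04, 0.05, 0.07, 0.03, 0.06]
p = paired_permutation_p(right_core, left_core)
```

With 7 subjects the smallest attainable P-value under this exhaustive scheme is 1/128, which is reached when every subject shows a difference in the same direction.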
Preferred Frequency Selectivity of Functional Connectivity in the Human Auditory Cortex
Voxel-by-voxel FC in the human auditory cortex was found to depend on the similarity of the preferred frequencies of voxels (Figs 2 and 3). Figure 2A shows a representative subject's tonotopy maps based on the preferred frequency of the voxels of interest (Fig. 2A, top), and the maps of FC between a voxel (seed voxel; cross-hair) and the remaining (target) voxels (Fig. 2A, bottom). The FC of the 3 exemplary seed voxels is plotted as a function of preferred frequency of the target voxels in Figure 2B. This voxel-by-voxel FC is then resorted as a function of preferred frequency difference between the seed and the target voxels as shown in Figure 2C. This sorting allows us to pool FC of all pairs of voxels, irrespective of preferred frequency (Fig. 2D).
Figure 3 demonstrates that frequency selectivity of FC is present in the averaged data across participants, both within hemispheres (i.e., when seed and target voxels were chosen within a hemisphere; LL: within the left hemisphere; RR: within the right hemisphere) and across hemispheres (i.e., when seed and target voxels were chosen in different hemispheres; LR). Voxel-by-voxel FC was binned with respect to the preferred frequencies of seed and target voxels to be presented as a matrix (Fig. 3A) and as a family of curves (Fig. 3B). High FC is in general observed along the diagonal of the matrices in Figure 3A, indicating higher FC between voxels with similar preferred frequencies. This pattern is also reflected in the peaks of FC in Figure 3B when seed and target voxels have the same preferred frequency. Note that the roles of seed and target voxels can be exchanged because FC does not contain directionality information. In Figure 3C, voxel-by-voxel FC was pooled and binned as a function of difference in preferred frequency (Δ frequency) between the correlated voxels to reveal that FC decreases as Δ frequency increases. Page's trend test indicated that the gradual decrease of FC as a function of Δ frequency is statistically significant for the 3 conditions of hemisphere pairs (LL: L = 383, χ2 = 26.42, P < 0.0001; RR: L = 385, χ2 = 28.00, P < 0.0001; LR: L = 383, χ2 = 26.42, P < 0.0001). These results hold for each tested individual: FC was found to be frequency-selective in all individual participants (Fig. 4), and the correlation between voxel-by-voxel FC and Δ frequency was significant in each participant in the within/between hemisphere conditions (P < 10^−6 for all cases).
Because of the tonotopic organization of auditory cortex (i.e., neurons with more similar frequency preference tend to be more closely located to one another), frequency preference of voxels would tend to be correlated with intervoxel distance. FC is also in general correlated with physical distance between paired areas or voxels (Salvador et al. 2005; Bellec et al. 2006; Honey et al. 2009) due to spatial smoothing of fMRI signals, point spread of fMRI blood oxygen level dependent (BOLD) signal (Engel et al. 1997) and/or the generic FC over the cortex whether local or global (Leopold et al. 2003; Honey et al. 2009; Schölvinck et al. 2010). Therefore, the observed frequency-selective FC could have been only a byproduct of these 2 correlations (correlation between frequency preference and intervoxel distance and that between intervoxel distance and FC). To ascertain that the correlation of FC with preferred frequency was not fully explained by the correlation of FC with distance, we corrected the FC with respect to intervoxel distance.
We first confirmed that the temporal correlations of fMRI residual signals within the auditory ROI and also outside of the ROI (in the remaining gray and white matter) are dependent on intervoxel distance (Fig. 5A). The white matter shows the sharpest decay as a function of distance. The spatial extent (∼3 mm) implies that the correlation observed in the white matter is presumably due to spatial smoothing effects in fMRI acquisition and motion correction rather than correlations in neural activity. In contrast, the gray matter has a less steep decay function, which is consistent with previous interpretations that this function reflects local and global correlation in neural activity (Leopold et al. 2003; Schölvinck et al. 2010). The correlation in nonauditory regions of gray and white matter decayed to zero within 30 mm, whereas that in the auditory ROIs asymptoted at a nonzero, positive value. This indicates that FC in the auditory cortex persists beyond the spatial extent of generic FC in the rest of the gray matter, which in turn implies that there is auditory cortex-specific FC in addition to generic correlation in on-going fMRI activity over the entire cortex.
As a second verification that distance does not explain the observed FC in auditory cortex, we predicted the FC that would have been observed if the generic correlation in the gray matter had completely explained the frequency selectivity of FC in the auditory cortex. We did so by assigning the correlation values in the gray matter to the voxels in the auditory ROIs that have equivalent intervoxel distance. The FC predicted on this basis showed a weak but significant frequency selectivity within hemispheres, but not between the hemispheres, since interhemispheric distances are much greater than 30 mm, where the correlation in the gray matter voxels decays away (Fig. 5B; LL: L = 385, χ2 = 28.00, P < 0.0001; RR: L = 385, χ2 = 28.00, P < 0.0001; LR: L = 334, χ2 = 0.50, P = 0.096). The small but significant frequency selectivity in the predicted FC within hemispheres could bias our findings, so we corrected this potential problem by subtracting the predicted FC values based solely on distance from the measured FC values. The corrected FC was still frequency-selective within and between hemispheres (Fig. 6; LL: L = 382, χ2 = 25.65, P < 0.0001; RR: L = 385, χ2 = 28.00, P < 0.0001; LR: L = 383, χ2 = 26.42, P < 0.0001), indicating that the distance effect is not sufficient to account for the frequency selectivity.
Stimulus Effect and Resting-Epoch FC
Although we subtracted the predicted response accounted for by the stimulus from the fMRI response, a stimulus effect might still remain due to incomplete model fit. To control for this possibility, we used only the residual fMRI activity from the fourth TRs of each block. The activity captured in these TRs can be considered as resting or on-going activity rather than stimulus-driven activity since it is collected 18 s after the previous stimulus presentation. The frequency selectivity of this “resting-epoch FC” was significant for all the within- and between-hemisphere conditions (Fig. 7A; LL: L = 382, χ2 = 25.65, P < 0.0001; RR: L = 384, χ2 = 27.21, P < 0.0001; LR: L = 383, χ2 = 26.42, P < 0.0001). Furthermore, we regressed out again the model responses to stimulus conditions from the resting-epoch time series in order to remove any remaining effect of the preceding stimulus condition. FC using the baseline activity with the stimulus effect regressed out was still frequency-selective (Fig. 7B; LL: L = 377, χ2 = 21.97, P < 0.0001; RR: L = 384, χ2 = 27.21, P < 0.0001; LR: L = 382, χ2 = 25.65, P < 0.0001).
Frequency Selectivity of FC in the Core and the Noncore Fields, and Hemispheric Differences
Upon confirming frequency selectivity of FC in the auditory cortex within and between hemispheres, including with intervoxel distance and the stimulus effect controlled, we next tested frequency selectivity of FC in the core and the noncore areas separately. We divided these areas to obtain 6 different pairs of areas within the hemispheres (Fig. 8A) and 4 pairs between the hemispheres (Fig. 8B). The results show that there is significant frequency selectivity for every pair (LC–LC: L = 349, χ2 = 6.61, P < 0.05; RC–RC: L = 383, χ2 = 26.42, P < 0.0001; LN–LN: L = 381, χ2 = 24.89, P < 0.0001; RN–RN: L = 385, χ2 = 28.00, P < 0.0001; LC–LN: L = 375, χ2 = 20.57, P < 0.0001; RC–RN: L = 382, χ2 = 25.65, P < 0.0001; LC–RC: L = 362, χ2 = 12.62, P < 0.0005; LN–RN: L = 384, χ2 = 27.21, P < 0.0001; LC–RN: L = 371, χ2 = 17.92, P < 0.0001; RC–LN: L = 383, χ2 = 26.42, P < 0.0001). Therefore, FC in the auditory cortex is frequency selective within and between areas, both within and between the hemispheres.
We then investigated whether the degree of frequency selectivity of local FC differs between core and noncore areas, and whether it differs between the left and right hemispheres. We quantified frequency selectivity as the negative slope of a linear function fit to FC data as a function of Δ (preferred) frequency for the 4 areas (LC, RC, LN, and RN), shown in Figure 9A. Permutation tests indicated that FC in the right core area had a higher frequency selectivity than the left core (P < 0.05; also, Wilcoxon signed-rank test: W = 27.0, P < 0.05) and than the noncore in the same hemisphere (permutation test: P < 0.05; Wilcoxon signed-rank test: W = 28.0, P < 0.05). There was no difference between the noncore areas of the 2 hemispheres (permutation test: P = 0.29; Wilcoxon signed-rank test: W = 18.0, P = 0.29) or between the 2 areas within the left hemisphere (permutation test: P = 0.32; Wilcoxon signed-rank test: W = 17.0, P = 0.344). The same analysis applied to the resting-epoch FC with the stimulus effect regressed out (corresponding to the data in Fig. 7B) showed a similar pattern (Fig. 9B) although the statistical significance was marginal (RC vs. LC: P = 0.08; RC vs. RN: P = 0.11; RN vs. LN: P = 0.24; LC vs. LN: P = 0.63). Thus, the right core field has higher frequency selectivity in its within-area FC than the other areas, notably its homolog in the left hemisphere.
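The selectivity index and the between-area comparison can be sketched as follows. The negative slope follows the definition given above; the permutation scheme shown here (exhaustive sign flipping of paired subject-wise differences) is one common choice and is an assumption, since the exact scheme is not described in this section. All names and numbers are hypothetical.

```python
import itertools
import numpy as np

def selectivity(delta_freq, fc):
    """Frequency selectivity as the negative slope of a line fit to FC vs. delta frequency."""
    slope, _intercept = np.polyfit(delta_freq, fc, 1)
    return -slope

def paired_signflip_p(a, b):
    """Two-sided exact sign-flip permutation test on paired differences."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    observed = abs(d.mean())
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(d)):
        total += 1
        if abs((d * signs).mean()) >= observed - 1e-12:
            count += 1
    return count / total

# Illustrative FC profile that falls linearly with frequency difference.
index = selectivity([0, 1, 2, 3, 4], [0.5, 0.4, 0.3, 0.2, 0.1])

# Hypothetical per-subject selectivity indices for two areas (7 subjects).
area_a = [0.12, 0.10, 0.15, 0.11, 0.13, 0.14, 0.16]
area_b = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
p_value = paired_signflip_p(area_a, area_b)
```

With 7 paired observations there are only 128 sign patterns, so the smallest attainable two-sided P value under this scheme is 2/128.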
Our study is, to the best of our knowledge, the first attempt to link intrinsic FC and frequency selectivity in the human auditory cortex. Our findings demonstrate that intrinsic FC measured with fMRI in human auditory cortex is organized in accordance with the frequency preference of voxels. Residual and resting-epoch activity in voxels with similar frequency preferences was more strongly correlated than in voxels with dissimilar frequency preferences. This correlation was not explained by generic FC in the activity of the gray matter voxels, or by residual stimulus effects. Furthermore, we observed that the frequency selectivity of intrinsic FC is significantly higher within the core fields of the right hemisphere than within the left core or the right noncore fields.
Frequency-Selective FC in the Auditory Cortex and Relation to Prior Studies
The existence of frequency selectivity of intrinsic FC in the human auditory cortex is in line with findings in other human sensory/motor cortices and nonhuman auditory cortex. Given the topographic organization of frequency preference (Merzenich and Brugge 1973; Romani et al. 1982; Morel et al. 1993; Wessinger et al. 2001), frequency-selective FC is the auditory analog of topographically organized FC previously found in the visual and somatosensory/motor cortices. In the visual cortex, retinotopy-specific FC was observed between early visual areas and ipsilateral and contralateral MT regions in the monkey (Vincent et al. 2007), and between early visual areas (Heinzle et al. 2011; Haak et al. 2012). Somatotopy-specific spatial organization of intrinsic FC was also found in the motor cortex of the human and in the somatosensory cortex of the squirrel monkey (van den Heuvel and Hulshoff Pol 2010; Cauda et al. 2011; Chen et al. 2011). It is notable that in these studies, resting-state fMRI activity was used for computing FC, and thus the measured FC is intrinsic and not explainable by stimulus input or motor output. Although we used residual activity instead of resting-state data, the previous literature and our own analysis strongly suggest that the FC we measured is intrinsic. There is considerable evidence that residual fMRI activity is highly correlated with spontaneous activity (Fox et al. 2006, 2007; Saka et al. 2010; Becker et al. 2011). In line with these previous findings, we empirically demonstrated that our residual FC was intrinsic and not accounted for by stimulus-driven activity, and an additional regress-out did not remove the frequency-selective pattern of FC (Fig. 7; see below for further discussion). Therefore, our results add to the evidence for a unifying proposal that intrinsic FC in the cerebral cortex is spatially organized with respect to sensory/motor tuning properties and/or their topographic organization (Jbabdi et al. 2013).
Our findings of frequency-selective FC are also consistent with those of electrophysiological studies in the nonhuman animal auditory cortex. Brosch and Schreiner (1999) found that correlation in spontaneous neural activity is strengthened as a function of similarity in receptive field properties in the cat primary auditory cortex. In guinea pigs, spontaneous activity was shown to have spatio-temporal patterns similar to those of tone-evoked activity (Saitoh et al. 2010). Rothschild et al. (2010) also reported frequency-selective coherence in both residual activity and on-going spontaneous activity in mice, which is closely related to our paradigm: both correlation in residual neural activity during tone stimulation and correlation in on-going activity taken from pre-stimulation time windows increased when the correlated neurons had similar frequency selectivity. Finally, Fukushima et al. (2012) demonstrated tonotopy-specific FC in spontaneous activity in macaque monkeys in resting-state data. Using micro-electrocorticographic arrays, they found that the spatial pattern of high-gamma-band voltage signals is consistent with the characteristic frequency maps along the supratemporal plane of the lateral sulcus. Considering that the fMRI signal is not only correlated with evoked neural activity (Logothetis et al. 2001) but is also related to spontaneous activity, and that very slow fluctuations of gamma-band power in electrophysiological signals are correlated with spontaneous fMRI activity (Leopold et al. 2003; Nir et al. 2008; Shmuel and Leopold 2008; Schölvinck et al. 2010), our findings of frequency selectivity in residual and ongoing fMRI activity are consistent with those of previous animal studies.
Functional Role of Functional Connectivity in Stimulus Encoding
While the functional role of frequency-selective FC, or feature-selective FC in general, is not yet clear, the importance of temporal coherence in the activity of sensory neurons has been discussed with respect to population coding and decoding of stimulus features (Fries 2005; Kohn and Smith 2005; Averbeck and Lee 2006; Pillow et al. 2008; Stevenson et al. 2012). For example, Stevenson et al. demonstrated that the activity of neurons simultaneously recorded in various cortical areas can be better predicted by a stimulus encoding model that incorporates FC, measured as trial-to-trial correlations (“noise” correlations), than by a model that accounts only for tuning functions. It is notable that in their study the correlations between neurons increased with tuning similarity, which is consistent with the present data. Incorporation of FC improved not only prediction accuracy in encoding, but also accuracy in decoding the presented stimuli. In other studies, behavioral performance of animal subjects was also related to trial-to-trial fluctuations and correlations in spiking activity (Bair et al. 2001; Pesaran 2010). Despite the difference in measurement, trial-to-trial fluctuations or spontaneous fluctuations in fMRI signals seem to have functional significance similar to that of the aforementioned neurophysiological data for 3 reasons (see Kohn et al. 2009): first, fast fluctuations observed in spiking activity or field potentials are nested in slow BOLD fluctuations as discussed above; second, trial-to-trial fluctuations or spontaneous fluctuations in fMRI signals are also predictive of behavioral performance (e.g., Fox et al. 2007; Monto et al. 2008) and perception (e.g., Boly et al. 2007); and finally, recent successes of multi-voxel encoding and decoding models in predicting neural activity and behavior from fMRI data indicate the importance of temporal coherence (covariance) information between voxels (Naselaris et al. 2011; Serences and Saproo 2012).
Therefore, correlations in residual activity and resting-epoch activity are likely to have functional significance in encoding and decoding stimulus features. The role of FC in the processing of stimulus features is important for interpreting our results with respect to functional hierarchy and asymmetry, as discussed below.
Controlling for Intervoxel Distance and Stimulus Effects
Considering the topographic organization of frequency selectivity in the cortex, the relationship between FC and frequency preference that we observed in our data could have been caused by an artifact due to the point spread of the fMRI signal (Engel et al. 1997; Parkes et al. 2005), or by the spatial extent of generic FC that might not be specific to frequency selectivity (Young et al. 1992; Leopold et al. 2003; Bellec et al. 2006; Honey et al. 2009; Schölvinck et al. 2010). The data presented in Figures 5 and 6, however, do not support this interpretation. Firstly, the spatial extent of correlation in the nonauditory gray matter is shorter than that in the auditory ROIs, which suggests that the FC in the ROIs has an additional component beyond this generic FC. Secondly, the distance between the hemispheres (Fig. 5A) is well beyond 25 mm, the point at which the FC in the nonauditory gray matter reaches its zero asymptote, but we still observed significant interhemispheric FC. Furthermore, we regressed out the portion of FC that could be explained by the spatial function of the generic FC in the nonauditory gray matter to remove any bias in the estimation of frequency selectivity in FC, and the corrected FC was still frequency-selective (Fig. 6). We therefore conclude that the observed frequency-selective FC is not due to artifacts related to the spatial smoothness of fMRI or to generic FC.
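One way to picture the distance correction is the following minimal sketch. It assumes that the generic FC is summarized as a binned function of intervoxel distance estimated from control (nonauditory) voxel pairs, and that this function is then regressed out of the FC of the auditory voxel pairs by ordinary least squares. The binning, the interpolation, the function names, and the simulated data are illustrative assumptions and do not reproduce the exact procedure used in the study.

```python
import numpy as np

def generic_fc_curve(dist, fc, bin_edges):
    """Mean FC per distance bin, estimated from control voxel pairs."""
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    idx = np.digitize(dist, bin_edges) - 1
    means = np.array([fc[idx == b].mean() for b in range(len(centers))])
    return centers, means

def remove_generic_fc(roi_dist, roi_fc, centers, means):
    """Regress the distance-predicted generic FC out of the ROI FC values."""
    predicted = np.interp(roi_dist, centers, means)
    X = np.column_stack([np.ones_like(predicted), predicted])
    beta, *_ = np.linalg.lstsq(X, roi_fc, rcond=None)
    return roi_fc - X @ beta

# Simulated control pairs: FC decays with distance (illustrative only).
rng = np.random.default_rng(0)
ctrl_dist = rng.uniform(0, 30, 2000)
ctrl_fc = np.exp(-ctrl_dist / 8) + 0.05 * rng.standard_normal(2000)
centers, means = generic_fc_curve(ctrl_dist, ctrl_fc, np.linspace(0, 30, 7))

# Simulated auditory pairs that share the same distance dependence.
roi_dist = rng.uniform(3, 27, 500)
roi_fc = np.exp(-roi_dist / 8) + 0.05 * rng.standard_normal(500)
corrected = remove_generic_fc(roi_dist, roi_fc, centers, means)
```

Any frequency-dependent structure that survives in the corrected values cannot be attributed to the distance function estimated from the control voxels, which is the logic of the control analysis.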
Another possible confounding factor could have arisen from stimulus-driven effects. The residual activity used to compute FC in our study should not be correlated with stimulus effects if the GLM is a valid statistical model, since it assumes independence between the explanatory variables and the error. Nevertheless, there might be a residual stimulus effect due to nonlinearity in the hemodynamic response or any systematic error in model fitting. For this reason, we tested frequency selectivity of FC based on correlations computed from the fourth TRs of each block of the residual time series. The gap between the offset of the preceding stimulus and the fourth TR is 18 s, so these time points would reflect on-going spontaneous activity that underlies evoked responses (Fox et al. 2006; Saka et al. 2010; Becker et al. 2011; see the discussion above). We found that the FC computed using the data in this time window is still frequency-selective (Fig. 7). This result is in agreement with the results of Rothschild et al. (2010), in which FC in residual activity during stimulus presentation (as in our residual FC data) and FC in on-going activity in pre-stimulus time windows (as in our resting-epoch FC data) are both frequency-selective.
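The property of GLM residuals invoked above can be checked numerically. The sketch below assumes a single simulated voxel and a boxcar regressor; these are illustrative choices, not the design used in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_tr = 200

# Hypothetical design: intercept plus an on/off boxcar regressor.
boxcar = (np.arange(n_tr) % 20 < 10).astype(float)
X = np.column_stack([np.ones(n_tr), boxcar])

# Simulated voxel time series: stimulus response plus noise.
y = 2.0 * boxcar + rng.standard_normal(n_tr)

# Ordinary least-squares fit and residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = y - X @ beta

# Residuals are orthogonal to every column of the design matrix.
orthogonality = X.T @ residual
```

The orthogonality holds only with respect to the modeled regressors. A response component that the model does not capture, such as a nonlinearity in the hemodynamic response, can still leave stimulus-locked structure in the residuals, which is why the resting-epoch control is needed.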
Fair et al. (2007) addressed whether residual fMRI activity and activity taken from interleaved resting epochs in evoked response experiments can act as a surrogate for conventional resting-state data. Their analysis showed that FC computed from the residuals of an event-related fMRI dataset has a topography that is qualitatively similar to, but quantitatively different from, that of resting-state FC, whereas fMRI activity taken from interleaved resting epochs between stimulations provides patterns quantitatively similar to FC from continuous resting-state scans. Moreover, the regions that showed significant correlations in the event-related dataset but not in the resting-state dataset overlapped with the regions that were activated by the task. This might be caused by remnants of task effects or by inconsistency in the pattern of intrinsic FC between task performance and rest. According to Fair et al., our results of frequency selectivity in residual FC might be confounded by the stimulus effects or might reflect a different pattern of FC than that of resting-state FC. However, our analysis of resting-epoch data suggests that the FC measures we used in this study would be comparable to those of continuous resting-state scans, and thus intrinsic (Fig. 7). Furthermore, the additional regress-out procedure removes any bias or error due to poor fitting of the hemodynamic response within a block, since only one frame per block was taken (Fig. 7B).
Implications of Frequency-selective FC for the Functional Architecture of Human Auditory Cortex
Our findings of frequency selectivity of intrinsic FC are closely related to 3 important aspects of the functional architecture and information processing of the human auditory cortex: frequency-selective connectivity, hierarchical organization, and hemispheric asymmetry. First of all, frequency selectivity of auditory neurons has been thought to be based on tonotopic or frequency-selective projections that are thalamocortical (McMullen and de Venecia 1993; Hashikawa et al. 1995; Miller et al. 2001; Kimura et al. 2003; Lee et al. 2004), corticocortical (Read et al. 2001; Lee et al. 2004), and commissural (Code and Winer 1985; Rouiller et al. 1991; Lee et al. 2004). While tonotopic organization in the human auditory cortex has been well documented (Romani et al. 1982; Formisano et al. 2003; Talavage et al. 2004; Humphries et al. 2010; Saenz and Langers 2014), anatomical connectivity in human brains has rarely been studied because of methodological limitations. Considering that the pattern of intrinsic FC is likely constrained by anatomical connectivity (Fox and Raichle 2007; Vincent et al. 2007; Van Dijk et al. 2010), frequency-selective FC in our results suggests that anatomical connectivity in the human auditory cortex is also frequency-selectively organized, in accordance with evidence from diffusion imaging (see Upadhyay et al. 2007).
In the framework of hierarchical processing (Rauschecker 1998; Kaas et al. 1999; Wessinger et al. 2001; Kumar et al. 2007; Rauschecker and Scott 2009), sensory features represented in the cortex become more complex and integrative in the later stages of the processing stream. This principle predicts that frequency tuning becomes wider and less selective as the order of hierarchy increases. How do our results relate to the difference in frequency selectivity in terms of response amplitude between cortical processing levels? One simple hypothesis is that the emergence of sharp tuning functions is dependent on local FC within a given area. Conversely, the local FC could be simply a byproduct of response amplitudes. Both predict that the selectivity of FC will follow the selectivity in response amplitude. In other words, the frequency selectivity of FC in the core fields would be higher than that in the noncore fields. Our results partially support this prediction: only the right core-fields area has particularly high selectivity in its within-area FC compared with the other areas (Figs 8 and 9). Also, our control analyses in which FC of the resting epochs was used and stimulus effects were regressed out (Fig. 7) rule out the possibility that FC is a simple reflection or a byproduct of amplitude structures that emerge locally.
Our results may offer an important new conclusion in the context of understanding the functional asymmetry of human auditory cortex. Many prior studies have suggested that there exists a functional asymmetry favoring the right auditory cortex in fine-grained tonal or spectral processing, and the left in temporal processing (Zatorre et al. 2002). For example, Zatorre and Belin (2001) demonstrated that cerebral blood flow in the right superior temporal gyrus and sulcus was more sensitive to differences in spectral separation of a series of pure tones than the left auditory cortex; these findings were subsequently replicated with BOLD signal measures (Jamison et al. 2006). The general conclusion of a right auditory cortex advantage for fine-grained spectral processing has been supported by various other studies using related approaches (Patterson et al. 2002; Zatorre et al. 2002; Schönwiesner et al. 2005; Hyde et al. 2008). The finding of greater frequency selectivity of intrinsic FC in the right compared with the left core area adds significantly to this body of literature by suggesting a possible mechanism by which functional asymmetries might emerge. A given voxel in the right auditory cortex is more likely to be connected to another voxel with similar spectral tuning than a voxel on the left side; therefore, encoding of fine-grained spectral information would tend to be enhanced, since neurons with similar tuning properties would be more functionally interconnected with one another. Conversely, on the left side, we speculate that the relative lack of such frequency-selective connectivity might reflect integration across frequencies, which could instead enhance temporal resolution.
There is one potential discrepancy, however, between the present data and those obtained earlier: whereas the present lateralized effects were limited to the right core area, prior studies largely reported asymmetries in noncore regions. For example, Schönwiesner et al. (2005) reported that the covariation of spectral complexity and fMRI activity was greatest in the antero-lateral belt on the right superior temporal gyrus; and Hyde et al. (2008) identified the right planum temporale to be sensitive to pitch variation in a tonal sequence. The particularly high frequency selectivity in FC of the right core area rather invites a hypothesis that incorporates both the hierarchical processing model and functional asymmetry: highly frequency-selective FC in the right core area may contribute to asymmetric responses to tonal variations in the later processing stages. In other words, a possible mechanism underlying this effect is that the right core area passes finer frequency information formed by temporal coherence to higher-order auditory areas. This idea is supported by empirical evidence and theoretical considerations. First of all, spectral information processing in the secondary auditory cortex seems to be greatly dependent on the primary auditory cortex in nonhuman animals. Neurons in the caudomedial area of rhesus monkeys have their responses abolished after deactivation of the primary auditory cortex (Rauschecker et al. 1997). Similarly, deactivation of the cat primary auditory cortex yields reduction in response strength and receptive field bandwidth for pure tone stimuli in the anterior and the posterior auditory fields (Carrasco and Lomber 2009a, 2009b). 
Anatomical tracer injection studies in cats and primates also imply that frequency-selective inputs to the noncore fields mostly originate from the core fields: the noncore fields receive most of their thalamic inputs from the medial and dorsal divisions of the medial geniculate complex, whose cells are known to be weakly frequency-selective, whereas the topographically organized inputs from the core areas suggest tonotopic organization of the corticocortical inputs (de la Mothe et al. 2006a, 2006b, 2012a, 2012b; Hackett 2011; Lee and Winer 2011). Also, it has been reported that most cortical connections to the posterior auditory field in the cat originate in the primary auditory cortex (Lee and Winer 2008). These data, taken together, suggest that spectral processing in the noncore areas in the right hemisphere is dependent on spectral processing in the ipsilateral core fields. Considering the role of FC as discussed above, this hierarchical organization of auditory cortex and the functional asymmetry of FC in our results support the idea that sharp frequency selectivity of FC in the right core area feeds forward to enable or support the enhanced spectral processing and the emergence of functional asymmetry at the later cortical stages.
Frequency selectivity of auditory neurons in early cortical areas is critical to spectral analysis and perception. We demonstrated that intrinsic FC in the human auditory cortex is frequency-selective, by correlating residual activity on a voxel-by-voxel basis. This pattern is explained neither by generic FC that is correlated with spatial distance nor by stimulus effects. The data in resting epochs maintained the frequency-selective FC even after controlling for possible residual stimulus effects. Frequency-selective FC in the auditory cortex is consistent with previous studies suggesting that intrinsic FC is constrained by the functional and anatomical organization of the cortex. We also found that frequency selectivity of FC was significantly higher in the right core-fields area than in the left core and the right noncore areas. This finding suggests that frequency-selective temporal fluctuation in the right core fields plays an important role in spectral analysis in higher-order areas of the right hemisphere, which are already known to be specialized for processing spectrally complex stimuli compared with their counterparts in the left.
This work was supported by a grant from the Canadian Institutes of Health Research to R.J.Z. and M.S. K.C. was supported by the Natural Sciences and Engineering Research Council of Canada. M.S. was supported by a Research Scholar grant from the Fonds de la Recherche en Santé du Québec.
Conflict of Interest: None declared.