Abstract

Binaural pitches are auditory percepts that emerge from combined inputs to the ears but that cannot be heard if the stimulus is presented to either ear alone. Here, we describe a binaural pitch that is not easily accommodated within current models of binaural processing. Convergent magnetoencephalography (MEG) and psychophysical measurements were used to characterize the pitch, heard when band-limited noise had a rapidly changing interaural phase difference. Several interesting features emerged: First, the pitch was perceptually lateralized, in agreement with the lateralization of the evoked changes in MEG spectral power, and its salience depended on dichotic binaural presentation. Second, the frequency of the pure tone that matched the binaural pitch lay within a lower spectral sideband of the phase-modulated noise and followed the frequency of that sideband when the modulation frequency or center frequency and bandwidth of the noise changed. Thus, the binaural pitch depended on the processing of binaural information in that lower sideband.

Introduction

The pitch of a sound is usually determined by its temporal, spectral, and harmonic characteristics. But in a “binaural pitch,” the corresponding neural excitation pattern only emerges through the combination of signals from both ears—the pitch is not heard in either monaural stimulus. Binaural pitches have provided useful insights into auditory processing in much the same way as visual illusions helped us to understand visual processing. Here, we describe a new kind of binaural pitch that prompts a reinterpretation of the auditory processing of rapidly changing binaural cues.

The principal functions of binaural hearing are sound localization and the improved detectability and discriminability of a signal heard in spatially distributed competing backgrounds (the “cocktail party effect”). Interaural time differences and interaural level differences (ITDs and ILDs) 1) provide cues to sound location in the horizontal plane and 2) help listeners to partially isolate a source from one location against a background of competing sources in other locations. The size of our head and the speed of sound jointly determine the interaural differences that occur naturally; the maximum ITD for a human head in air, for example, is about 660 μs. Additional information arises when we turn our heads from side to side; our auditory system processes the dynamic changes in ITD and ILD as our ears move within the “auditory world” around us. However, physiology limits the speed at which we can turn our head (and hence the “natural” rate of change in ITD) to a maximum of about 1800°/s, and that only very briefly. ITDs in narrowband stimuli are sometimes represented as interaural phase differences or delays (IPDs), and we shall follow that convention.
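As a rough check on the figure of about 660 μs, the following minimal sketch applies Woodworth's rigid-sphere approximation, which is not part of the present study. The head radius (8.75 cm) and speed of sound (343 m/s) are assumed textbook values, and the function names are hypothetical. The second function shows how an ITD maps onto an IPD at a single frequency.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """ITD in seconds for a distant source and a rigid spherical head.

    Woodworth's approximation: ITD = (a / c) * (theta + sin(theta)).
    The default radius and sound speed are assumed values, not measurements.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))


def itd_to_ipd_deg(itd_s, freq_hz):
    """Interaural phase difference (degrees) equivalent to an ITD at one frequency."""
    return 360.0 * freq_hz * itd_s


max_itd = woodworth_itd(90.0)  # source directly opposite one ear
print(f"maximum ITD: {max_itd * 1e6:.0f} microseconds")
print(f"equivalent IPD at 500 Hz: {itd_to_ipd_deg(max_itd, 500.0):.0f} degrees")
```

With these assumed values the maximum ITD comes out a little under 660 μs, consistent with the approximate figure quoted above.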

In the laboratory, the excursion and rate of change in binaural cues can be artificially manipulated to help develop and test models of binaural hearing. A landmark study by Grantham and Wightman (1978), exploring the perception of dynamic ITD cues, noted the sluggishness of the binaural system in tracking auditory movement and noted that the mechanisms for the processing of rapid dynamic change (oscillations above about 20 Hz) were different from those involved at lower rates and smaller excursions; no explanation of the underlying mechanism has since been suggested. Based on the data presented here, we suggest that at high rates of changing IPD, listeners may actually detect the changes by hearing a binaural pitch that emerges because of the way in which information in the spectral sidebands of binaurally phase-modulated sounds is processed. When presented with a stimulus with a relatively high rate of interaural phase modulation in a previous study (Witton et al. 2005), our listeners reported hearing a pitch, localized to one side by one observer and to the other side by the other observer, despite there being no a priori reason to expect any lateralized pitch to be elicited. This pitch is the principal topic of this paper.

Neuroimaging studies have provided evidence for a pitch-selective area located posterior to primary auditory cortex, in planum temporale (Hall and Plack 2009). This area is activated by a range of pitch-inducing stimuli including Huggins' pitch (a binaural pitch) and harmonic complexes with missing fundamental. Further evidence for cortical pitch selectivity has been provided by the measurement of “pitch-onset responses” with electroencephalography (e.g., Krumbholz et al. 2003) and magnetoencephalography (MEG) (Hertrich et al. 2005; Chait et al. 2006), where a pitch arises partway through a noise stimulus. It has been reported that pitch-onset responses may have greater amplitude in the left hemisphere (Chait et al. 2006), although this has not been extensively studied in individuals.

The present study therefore set out to determine whether rapidly changing IPDs generally elicit a binaural pitch in listeners, whether a cortical locus associated with such a binaural pitch could be established, and whether laterality of the cortical response to pitch corresponds to perceptual laterality. We also used behavioral measures to determine how the pitch depended on the center frequency and bandwidth of the modulated noise and on the rate of phase modulation. The characteristics of this new binaural pitch suggest an explanation for Grantham and Wightman's original findings, and for those of Witton et al. (2005), and lead to new insights into the processing of rapidly changing binaural cues.

Methods, Stimuli, Procedures, and Subjects

Stimuli

The stimuli used here, and by Grantham and Wightman, differ from previous binaural pitch stimuli because the interaural phase differences involved are continually changing and are not restricted to a fixed spectral band within the stimulus. Three previously studied binaural pitches are Huggins’ pitch (Cramer and Huggins 1958), created when an IPD is introduced over a narrow frequency band within a broadband noise, and binaural edge pitch (Klein and Hartmann 1981) and binaural coherence edge pitch (Hartmann and McMillon 2001), both of which result from the introduction of an IPD either above or below a certain frequency “edge” within a band of noise. These pitches are illustrated in Figure 1a,b, which show the mean power of a band of noise as gray rectangles and the location of an interaural phase change (dotted black lines). In each case, the percept of pitch (with matching pure-tone frequency at the star) emerges near the frequency at which the change in interaural phase or correlation occurs. Our stimulus is a band of noise in which each spectral component is sinusoidally phase modulated, with a 180° phase delay between the phase modulation in the two ears. This creates a dynamic IPD. Spectral sidebands are produced by the phase modulation, as illustrated for a single sinusoidal component in Figure 1c, where amplitude is plotted as a function of frequency. The spectrum of a sinusoidal carrier (frequency fc Hz) modulated to a depth of 0.3 at a rate of fm Hz is shown. Sidebands are created at integer multiples of fm below and above fc, but their amplitudes fall rapidly when the modulation index is small (Goldman 1948). When each component of a band of noise is modulated, the sidebands overlap with the noise components but extend beyond the spectrum of the unmodulated noise. In the dichotic stimulus we use, the modulation is out of phase between the ears, so the IPD extends across the spectrum, as illustrated in Figure 1d, which also shows, in light gray, the regions where the principal sidebands extend beyond the noise band.

Figure 1.

Simplified schematic diagrams to illustrate 2 well-described binaural pitches: Huggins' pitch (a) and a binaural edge pitch (b). The 2 figures show mean power as a function of frequency with the power spectra of the noise stimuli shown as gray rectangles (left ordinate) and the IPDs indicated by black dots (right ordinate). The pure-tone frequency matching the pitch produced by the stimulus is indicated by a star near the frequency of the abrupt interaural phase shifts. In Huggins' pitch (a), the IPD varies over a narrow frequency range, while in an edge pitch (b), there is an abrupt IPD “edge.” (c) The amplitude spectrum of a sinusoidal carrier of frequency fc Hz that has been sinusoidally phase modulated at a frequency fm. Spectral sidebands emerge above and below the carrier frequency at multiples of the modulation rate. The amplitudes of the sidebands and carrier depend nonlinearly on the modulation index; for the index used here (0.3), only the first pair of sidebands has significant amplitude. (d) The novel stimulus, where each component of the band-limited noise (dark gray rectangle) is phase modulated so the spectral sidebands (light gray) extend beyond the spectrum of the unmodulated noise. The sidebands have opposite phase in each ear, so the interaural phase difference or delay (IPD, right ordinate), which extends beyond the unmodulated noise, is, in reality, greatest at frequencies where the spectrum is composed only of sidebands.

Each stimulus was a 1-s-long band-limited white noise. Spectral components were spaced at 1-Hz intervals and had random (approximately uniformly distributed) phase and random (approximately Rayleigh distributed) amplitude. Sinusoidal phase modulation could be introduced on each component. Identical complexes were presented to each ear, except that the phase modulation could be either diotic or dichotic. In the diotic condition, the phase modulation was identical for each ear, but in the dichotic condition, there was a 180° phase shift between the phase modulation applied to the noise band common to the left and right ears. Thus, the dichotic stimuli effectively contained an interaural phase modulation at the modulation rate. The band-limited noise was generated afresh for each presentation of the stimuli, and all stimuli were gated with 1-ms raised cosine rise and fall times. In the MEG recordings, as well as in the pitch-matching experiments, the onset of phase modulation occurred 500 ms after the stimulus onset and continued for the remaining 500 ms; the onset of phase modulation was again gated with 1-ms raised cosine rise and fall times.
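The stimulus construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' stimulus-generation code: the sampling rate (4000 Hz), the interpretation of the modulation index as radians, the unit Rayleigh scale, and all function and parameter names are assumptions introduced here. The standard library alone is used so that the sketch runs without dependencies.

```python
import math
import random


def raised_cosine(t, start, ramp):
    """Gate that rises from 0 to 1 over `ramp` seconds, beginning at `start`."""
    if t <= start:
        return 0.0
    if t >= start + ramp:
        return 1.0
    return 0.5 * (1.0 - math.cos(math.pi * (t - start) / ramp))


def make_stimulus(dichotic, fc=500.0, bw=250.0, fm=60.0, index=0.3,
                  fs=4000, dur=1.0, mod_onset=0.5, ramp=0.001, seed=None):
    """Return (left, right) sample lists for one freshly generated noise."""
    rng = random.Random(seed)
    n_comp = int(bw) + 1
    # Components at 1-Hz spacing, uniform random phase, Rayleigh amplitude.
    freqs = [fc - bw / 2.0 + k for k in range(n_comp)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_comp)]
    amps = [math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n_comp)]

    left, right = [], []
    for n in range(int(fs * dur)):
        t = n / fs
        angles = [2.0 * math.pi * f * t + p for f, p in zip(freqs, phases)]
        s = sum(a * math.sin(x) for a, x in zip(amps, angles))
        c = sum(a * math.cos(x) for a, x in zip(amps, angles))
        # Overall 1-ms onset and offset gates.
        gate = raised_cosine(t, 0.0, ramp) * (1.0 - raised_cosine(t, dur - ramp, ramp))
        # The modulation depth is itself gated on at mod_onset.
        depth = index * raised_cosine(t, mod_onset, ramp)
        m_left = depth * math.sin(2.0 * math.pi * fm * t)
        # A 180-degree shift of the modulator simply inverts its sign.
        m_right = -m_left if dichotic else m_left
        # sum(a * sin(x + m)) expanded so the same noise serves both ears.
        left.append(gate * (s * math.cos(m_left) + c * math.sin(m_left)))
        right.append(gate * (s * math.cos(m_right) + c * math.sin(m_right)))
    return left, right
```

Calling `make_stimulus(True)` gives one dichotic exemplar and `make_stimulus(False)` a diotic one. Because every component receives the same modulator, the two ears share the identical noise during the first 500 ms and differ only in the sign of the phase modulation thereafter.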

Figure 2 shows power spectra of a 250-Hz-wide unmodulated band of noise centered on 500 Hz (black) and the same noise modulated at 60 Hz with a modulation index of 0.3 (light gray). Power, in decibels, is plotted against frequency. During modulation, the energy in the carrier is redistributed through spectral sidebands placed above and below the components of the noise at multiples of the modulation rate; thus, additional spectral components (illustrated previously for a sinusoidal carrier in Fig. 1c and for the noise in Fig. 1d) are added to the noise when modulation is introduced. The distribution of energy between each sideband and carrier is determined by a Bessel function of the modulation index (Goldman 1948). At the relatively small modulation depths used here, only a single pair of spectral sidebands contains significant energy, and the overall spectral energy is unchanged (Goldman 1948). The onset of modulation in our neuroimaging stimuli is therefore accompanied by no change in the total spectral power but by an effective increase in stimulus bandwidth of twice the modulation frequency, which could theoretically evoke a neurophysiological response despite being perceptually nonsalient. However, the diotic and dichotic stimuli both contain identical phase modulation (and the accompanying increase in bandwidth), so they are matched both in terms of total spectral energy and in the spectral distribution of energy; they differ only in the interaural phase angle of their spectral components.
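The Bessel-function relationship can be checked numerically for a single sinusoidal carrier. The sketch below is illustrative only: it evaluates the Bessel function of the first kind from its power series, and the function names are hypothetical. For a phase-modulated sinusoid, the carrier amplitude scales with J0 of the modulation index and the nth sideband pair with Jn.

```python
import math


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x), n >= 0, from its power series."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2.0) ** (2 * k + n) for k in range(terms))


def sideband_levels_db(index, max_order=3):
    """Level (dB re the unmodulated carrier) of the carrier (order 0) and each sideband order."""
    return {n: 20.0 * math.log10(abs(bessel_j(n, index)))
            for n in range(max_order + 1)}


def total_power(index, max_order=10):
    """Carrier power plus the power in the upper and lower sideband of every order."""
    return bessel_j(0, index) ** 2 + 2.0 * sum(
        bessel_j(n, index) ** 2 for n in range(1, max_order + 1))


for order, level in sideband_levels_db(0.3).items():
    print(f"order {order}: {level:.1f} dB")
print(f"total power relative to unmodulated carrier: {total_power(0.3):.6f}")
```

For an index of 0.3, the first sideband pair lies roughly 17 dB below the unmodulated carrier and the second roughly 39 dB below, while the summed power stays at 1, in line with the statements that only one pair of sidebands carries significant energy and that the overall spectral energy is unchanged.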

Figure 2.

The level (dB) of a sample of an unmodulated 250-Hz-wide band of noise centered on 500 Hz (black) together with the same noise phase modulated at 60 Hz and having a modulation index of 0.3 (gray)—both as a function of frequency. When the phase modulation is 180° out of phase between the ears, a pitch corresponding to 331 Hz emerges with this stimulus. The dotted shape illustrates a “critical band” filter centered on that frequency.

MEG Recordings and Data Analysis

Nine participants, 6 females, all of whom reported normal hearing and no neurological problems, participated in the neuroimaging study. Of the participants, 2 were experienced listeners (authors G.B.H. and C.W.). MEG recordings were made using a 275-channel CTF MEG system with third-order axial gradiometers while participants were seated with their eyes open. Stimuli were presented through echoless plastic tubing with foam ear inserts. One hundred exemplars each of the diotically and the dichotically modulated noise stimuli were presented in pseudorandom order with a 1-s silent interstimulus interval. The recorded data were segmented into 1.5-s epochs starting 500 ms before the stimulus onset. Each epoch was baseline corrected using the prestimulus period and comb filtered to remove the 50-Hz power line artifacts.

For initial summary analysis of the data from the MEG sensors, global field power (root mean square across channels; Fig. 3) was computed separately for the left and right hemispheres, and 1-Hz high-pass and 30-Hz low-pass filters were applied (Lehmann and Skrandies 1980). This approach combines information across channels and rectifies the time series, allowing simple calculation of the peak latency of the major evoked components. The data showed typical responses to the onset of sound energy (Näätänen and Picton 1987); all participants showed a salient N1m/P2m complex within the first 250 ms following the onset of stimulation (N1m, mean latency 112 ms [standard error of the mean, SEM 4 ms] in both the left and the right hemispheres; P2m, mean latency 207 ms [SEM 8 ms] in the left and 204 ms [SEM 6 ms] in the right hemisphere; see also Table 1).
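The global field power summary can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline used in the study: the band-pass filtering is omitted, an epoch is assumed to be a list of per-channel sample lists, and the amplitude measure (peak minus baseline mean, divided by baseline standard deviation) is one plausible reading of "standard deviation above baseline." All function names are hypothetical.

```python
import math


def baseline_correct(epoch, n_baseline):
    """Subtract each channel's mean over its first n_baseline (prestimulus) samples."""
    corrected = []
    for channel in epoch:
        mean = sum(channel[:n_baseline]) / n_baseline
        corrected.append([v - mean for v in channel])
    return corrected


def global_field_power(epoch):
    """Root mean square across channels at each time sample."""
    n_channels = len(epoch)
    return [math.sqrt(sum(v * v for v in column) / n_channels)
            for column in zip(*epoch)]


def peak_in_window(gfp, times, t_min, t_max, n_baseline):
    """Latency of the largest GFP value in [t_min, t_max] and its size in baseline SDs."""
    baseline = gfp[:n_baseline]
    mean = sum(baseline) / len(baseline)
    sd = math.sqrt(sum((v - mean) ** 2 for v in baseline) / (len(baseline) - 1))
    candidates = [i for i, t in enumerate(times) if t_min <= t <= t_max]
    idx = max(candidates, key=lambda i: gfp[i])
    return times[idx], (gfp[idx] - mean) / sd
```

Applying `global_field_power` separately to the left- and right-hemisphere channel subsets, and then `peak_in_window` over a window such as 0.15 to 0.25 s, would give per-hemisphere latency and amplitude values of the kind tabulated in Table 1.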

Table 1

The P2m response (latency and amplitude) and the response evoked by the onset of dichotic modulation (latency and amplitude) all calculated from individuals' averaged global field power traces are presented separately for each hemisphere and for all listeners

Columns: (a) P2m latency (ms); (b) P2m amplitude (standard deviation above baseline); (c) dichotic-modulation response latency (ms); (d) dichotic-modulation response amplitude (standard deviation above baseline).

Listener   (a) Left  (a) Right (b) Left   (b) Right  (c) Left         (c) Right (d) Left   (d) Right
1a         242       196       9.0        10.3       160              183       3.3        7.3
2          218       211       3.6        14.6       203              185       3.1        4.3
3          228       233       5.7        5.1        183              218       3.2        5.3
4          175       200       0.4        3.3        233              203       2.3        0.4
5a         203       222       1.0        2.3        192              198       0.6        2.9
6          230       198       0.4        1.5        180              193       5.9        2.6
7          205       178       9.4        9.4        No visible peak  177       .          2.8
8          183       185       5.3        4.7        187              172       2.6        8.3
9          178       215       7.7        5.0        168              165       9.4        3.5
Mean ± SD  207 ± 24  204 ± 18  4.7 ± 3.6  6.2 ± 4.3  188 ± 22         188 ± 17  3.4 ± 2.8  4.2 ± 2.5

Note: Means ± 1 standard deviation are given in the bottom row. Column (a) shows latency of the left and right hemispheres' P2m-evoked response. Column (b) shows the amplitude of the P2m response expressed as the standard deviation above the baseline from the silent period prior to the stimulus onset. Columns (c) and (d) show corresponding data for the onset of dichotic modulation.

a Left-handed listeners.

Figure 3.

MEG global field power for responses to phase-modulated noise stimuli—dichotic (a, upper panels) and diotic (b, lower panels). Data for the left and right hemispheres are shown separately and each colored trace represents data from an individual participant. In each plot, the stimulus onset was at 0 s and phase modulation commenced at 0.5 s; both times are indicated by vertical dotted lines. All 4 plots show clear N1m and P2m responses to the onset of sound, but responses to the onset of phase modulation are only seen in the dichotic condition.

Volumetric source analysis was then performed using a beamformer method, synthetic aperture magnetometry (SAM) (Robinson and Vrba 1999), which uses the covariance matrix of the data to compute a set of weights which effectively apply a spatial filter to the data. SAM is most commonly used to localize nonstimulus locked or “induced” activity with particular time–frequency characteristics (for a review, see Hillebrand et al. 2005), but more recent implementations have developed this approach to localize evoked activity by calculating the average evoked power in the SAM time series for a particular latency (Cheyne et al. 2006, 2007). The benefit of using SAM is that it provides excellent artifact rejection (Adjamian et al. 2009), obviating the need to remove trials with eye movements, for example, and provides a strong signal-to-noise ratio in reconstructed time series. Additionally, and unlike in dipole modeling (Baillet et al. 2001), no a priori knowledge about the number of active sources is required. Beamformer methods tend to minimize sources if the time course of activity is tightly correlated (Hillebrand and Barnes 2005; Brookes et al. 2007) over more than 30–40% of the analysis time window (Hadjipapas et al. 2005). Evoked responses resulting from binaural stimulation, where activity in the left and right hemispheres may be tightly stimulus locked, are likely to result in transient correlation between sources, resulting in a loss of power in one or both hemispheres. We overcame this potential problem by separately analyzing the data for each side of the head (i.e., left- and right-hemisphere channels) (Herdman et al. 2003). Using this method to localize sources with sustained correlation, such as the auditory steady-state response, can result in localization errors (with sources being moved medially) (Popescu et al. 2008) but for transient evoked responses such as those reported here such errors are likely to be small (Hadjipapas et al. 2005). Importantly, analyzing left and right channel data separately should reduce any effects of transient correlation on the magnitude of pseudo-t values.

The MEG data set for each participant was spatially coregistered with a structural magnetic resonance image (MRI) using a modification of a previously described surface-matching approach (Adjamian et al. 2004). A source space with 5-mm voxels was derived from the scalp outline for each individual, and a multisphere head model (Huang et al. 1999) was used to model the volume conduction. For the initial source model, the beamformer weights were computed over the 5- to 25-Hz frequency band and over 2 time periods chosen to capture, first, the evoked responses to the onset of the sound (50–250 ms) and, second, the evoked responses to the onset of the phase modulation (600–800 ms). This relatively broad frequency band was chosen based on pilot testing and confirmed by results such as those illustrated in the spectrograms in Figure 4; restricting the analysis to traditional frequency bands such as theta did not yield consistent source models across the group because of individual differences in the peak spectral power of responses. Evoked beamformer activations were computed using the method described by Cheyne et al. (2006), referred to as the event-related (ER) beamformer. For each individual, in each hemisphere separately, ER-beamformer images were computed at the latency of the peak N1m and P2m responses and at the latency of the response to the onset of phase modulation, using the weights from the relevant time period (above). The time point of the ER-beamformer image therefore varied with the latency of the peak response across individuals and across hemispheres.

Figure 4.

Sensor data are shown for 2 example participants, one who reported a right-lateralized pitch and one for whom the pitch was left lateralized. (a,d) The bilateral dipolar field patterns observed at the peak of the response evoked by dichotic modulation. (b,c and e,f) Spectrograms of data from one sensor in the left and one in the right hemisphere for each individual, for data in the dichotic (central panels) and diotic (right panels) conditions. In each case, the sensor was taken from the peak of the posterior pole of the field pattern. Spectral power, computed using a Stockwell transform, is expressed as a percentage of maximum activity in the whole epoch across the whole frequency range shown. A prominent response to the onset of sound energy is seen in each spectrogram, between about 0 and 300 ms, and extending through the theta and alpha bands. A response to the onset of modulation is seen after about 600 ms in the spectrograms for the dichotic condition, in the left and right hemisphere for the first participant, for whom the pitch was right lateralized (a–c), and predominantly in the right hemisphere for the second participant, for whom the pitch was left lateralized (d–f). There are individual differences in the frequency distribution of spectral power for this response across all participants, illustrated here by its restriction to the theta band in (f) but extension to the alpha range in (c).

To produce the group images shown in Figure 5 and described in Table 3, each individual ER-beamformer image was normalized before the images were overlaid. The rescaling allows a simple visualization of spatial consistency in the images: if the magnitude of a peak in the group image were to equal the number of subjects, this would indicate that the activation patterns were identical. A large number means that the strongest activations across participants have similar spatial topographies. A smaller group peak means that, in some individuals, the maximum peak was elsewhere. This approach, although not a statistical approach, is more robust to the possible effects of small numbers of large activations (source–strength outliers) than are fixed-effects methods which essentially rely on averaging brain images. Table 3 shows the group peaks yielded with the spatial consistency metric, and also those from a fixed-effects group analysis, for confirmation. The fixed-effects analysis confirms the top activations in the left and right hemispheres from the consistency measure for all responses and places them in the same order for the P2m and the response to dichotic modulation.
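The logic of the spatial consistency overlay can be illustrated with a toy example. This is a minimal sketch that assumes each image is normalized by dividing by its own maximum (the text says only that images were normalized) and that an image is a flat list of nonnegative voxel values; the function names are hypothetical.

```python
def normalize(image):
    """Rescale an image so that its strongest voxel equals 1."""
    peak = max(image)
    return [v / peak for v in image]


def consistency_map(images):
    """Voxelwise sum of individually normalized images."""
    return [sum(values) for values in zip(*(normalize(img) for img in images))]


def fixed_effects_map(images):
    """Voxelwise mean of the raw images, shown for comparison."""
    return [sum(values) / len(images) for values in zip(*images)]


# Two participants peak at voxel 0; a third has one very strong peak at voxel 1.
images = [[1.0, 0.0], [1.0, 0.0], [0.0, 100.0]]
print(consistency_map(images))    # peak at voxel 0, where two participants agree
print(fixed_effects_map(images))  # peak at voxel 1, dominated by the outlier
```

In this toy case, the consistency map peaks where most participants agree, whereas simple averaging is dominated by the single large activation, which is the robustness property described above. If all images were identical up to scale, the consistency peak would equal the number of participants.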

Table 2

The laterality in the spectral power of the participants' brain responses and in their perception of pitch

Columns: (a) Spectral power of pitch response in a peak channel (percentage); (b) “Dichotic Advantage” (P values for t-tests on differences between responses to dichotic and diotic modulation); (c) Reported perceptual laterality of the pitch.

Listener  (a) Left  (a) Right  (a) Laterality  (b) Left   (b) Right  (b) Laterality  (c) Reported laterality
1a        72        74         Bilateral       P < 0.001  P < 0.001  Bilateral       Right
2         71        86         Right           -          P < 0.001  Right           Left
3         23        60         Right           -          P < 0.01   Right           Left
4         99        48         Left            -          P < 0.062  Right (n.s.)    Left
5a        14        45         Right           P < 0.001  P < 0.001  Bilateral       Bilateral
6         81        56         Left            P < 0.01   -          Left            Right
7         19        50         Right           n.s.       P < 0.01   Right           Left
8         60        67         Bilateral       P < 0.001  P < 0.001  Bilateral       Bilateral
9         74        48         Left            P < 0.001  P < 0.01   Bilateral       Bilateral

Note: Column (a) shows the spectral power of the response to dichotic modulation from a single sensor at the peak of the dichotic-modulation response's field pattern. Spectral power, derived from Stockwell transforms (Fig. 4), is expressed as a percentage of the maximum response to the dichotic-pitch stimulus relative to the maximum response computed over the whole epoch and between 1 and 100 Hz, thereby accounting for left–right variability in overall signal. Laterality, summarized on the right-hand side of (a), is assumed to be bilateral if the left and right responses are within 10%. Column (b) summarizes data from the ER-beamformer source model. Where the ER beamformer yielded a source in the temporal lobe, a t-test was computed between the amplitudes of the response to dichotic modulation and the response to diotic modulation at this voxel. P values of the tests are shown. Dashes in column (b) indicate that there was no temporal lobe source for that participant in that hemisphere. Only data for participant 4 were not consistent across all 3 columns. Column (c) shows the perceptual laterality of the pitch in the dichotic condition reported by each participant directly after the recording. n.s., not significant.

a Left-handed listeners.
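The laterality rule used for column (a) of Table 2 can be written out explicitly. This is a minimal sketch that reads "within 10%" as a difference of no more than 10 percentage points, inclusive; both the reading and the function name are assumptions, although this reading does reproduce every laterality label in column (a).

```python
def laterality(left_pct, right_pct, tolerance=10.0):
    """Classify a left/right pair of spectral-power percentages.

    'Bilateral' when the two values differ by no more than `tolerance`
    percentage points (assumed inclusive), otherwise the larger side.
    """
    if abs(left_pct - right_pct) <= tolerance:
        return "Bilateral"
    return "Left" if left_pct > right_pct else "Right"


# Left and right spectral power for listeners 1 to 9, from Table 2, column (a).
table2_power = [(72, 74), (71, 86), (23, 60), (99, 48), (14, 45),
                (81, 56), (19, 50), (60, 67), (74, 48)]
labels = [laterality(left, right) for left, right in table2_power]
print(labels)
```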

Table 3

The MNI coordinates for the centers of peak group activity for the 3 evoked responses that were analyzed; the N1m, the P2m, and the response to dichotic modulation

MNI coordinates          Area                                Activation
N1m response
    Spatial coincidence
        187, 110, 103    Right superior temporal gyrus       5.66
        76, 128, 97      Left transverse temporal gyrus^1    5.20
        190, 143, 100    Right superior temporal gyrus^2     4.38
        142, 65, 88      Right superior temporal gyrus       3.21
    Fixed-effects group analysis
        76, 128, 98      Left transverse temporal gyrus^1    78.02
        198, 113, 106    Right superior temporal gyrus       62.67
        190, 143, 100    Right superior temporal gyrus^2     47.83
P2m response
    Spatial coincidence
        184, 116, 103    Right superior temporal gyrus^3     6.09
        76, 122, 97      Left superior temporal gyrus^4      5.34
        187, 146, 97     Right superior temporal gyrus       4.02
        73, 158, 103     Left middle temporal gyrus          3.29
    Fixed-effects group analysis
        187, 116, 103    Right superior temporal gyrus^3     70.12
        76, 122, 97      Left superior temporal gyrus^4      63.17
        197, 146, 97     Right superior temporal gyrus       48.91
Response to dichotic modulation
    Spatial coincidence
        190, 128, 94     Right superior temporal gyrus^5     6.60
        67, 143, 103     Left superior temporal gyrus^6      4.99
        197, 146, 97     Right superior temporal gyrus       4.02
        82, 116, 76      Left precentral gyrus               3.94
        79, 83, 112      Left inferior frontal gyrus         3.76
        82, 152, 64      Left inferior parietal lobule       3.17
    Fixed-effects group analysis
        193, 128, 94     Right superior temporal gyrus^5     78.77
        70, 143, 97      Left superior temporal gyrus^6      59.48
        76, 125, 76      Left precentral gyrus               44.40

Group analysis identified areas of maximum interindividual spatial coincidence in the ER-beamformer activation patterns (see also Fig. 5). Coincidence is reported as a score out of 9, the value that would be obtained if all participants' activation patterns were identical (for dichotic modulation, however, the maximum score is limited to 8 in the right hemisphere and 6 in the left hemisphere because several participants had unilateral responses). Noninteger values arise because the group peaks result from the summation of individual peaks that were close but not identical across participants, so values less than 1 are summed. All activations with a score of 3 or above (33% consistency) are shown. Also shown are the MNI coordinates of activation peaks from fixed-effects group images (SNPM), thresholded at 40. Superscript numbers 1-6 highlight activations that were identified by both group-imaging approaches. For details of the group analyses and the rationale for using complementary approaches, see Materials and Methods. MNI, Montreal Neurological Institute.

Figure 5.

Group overlay images of the ER-beamformer peaks for the N1m- (a), P2m- (b), and dichotic-modulation (c) responses. The relative timing of the evoked response from which these activations were derived is illustrated in (d). (e) The same activations overlaid on a single slice; the activity for the N1m response is largely obscured by the P2m response; however, the response to dichotic modulation lies posterior to these, in planum temporale. The activations were individually thresholded, showing activity which is above 75% of the maximum for each response, to illustrate the peak areas of consistent activation. The absolute degrees of consistency are given in Table 2. Further details and rationale for the overlay images are given in the Methods.

Virtual electrode (VE) time series were then reconstructed for the peak voxels in the ER-beamformer images for each individual. Weights for the peak voxel, computed using the same time–frequency parameters as for the volumetric images (5–25 Hz, over a 200-ms time period), were multiplied by the unfiltered sensor data for the whole epoch. Thus, plots of VE data will accurately depict events at the source of activity when they fall within that time–frequency bin. They will also illustrate activity measurable at that source before and after the time window of interest, and at higher or lower frequencies, but with lower spatial accuracy because this activity may be leaking from adjacent sources. For the analysis in this study, all activity of interest fell close to or within the chosen time–frequency bin. Averaging the VE time series data across epochs provided a spatially filtered evoked response for that cortical location. Time–frequency plots were also produced for these averaged time series using Stockwell transforms (Stockwell et al. 1996). Statistical analysis was performed on the VE time series for each epoch. The amplitude of the response at the peak latency was compared across epochs for the dichotic versus the diotic condition for each participant in each hemisphere, using an independent-samples t-test.
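The VE steps described above (weighting the sensor data, averaging across epochs, and comparing peak amplitudes between conditions) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function names, the two-sensor weights, and all data values are hypothetical, the beamformer weights are assumed to have been computed elsewhere, and only the pooled-variance t statistic and its degrees of freedom are calculated (no P value).

```python
import math
import statistics


def virtual_electrode(weights, sensor_epoch):
    """Project one epoch of sensor data through beamformer weights.

    weights: one weight per sensor (assumed to be computed elsewhere).
    sensor_epoch: list of per-sensor sample lists (n_sensors x n_samples).
    Returns the weighted sum across sensors at each time sample.
    """
    return [
        sum(w * x for w, x in zip(weights, samples))
        for samples in zip(*sensor_epoch)
    ]


def average_epochs(ve_epochs):
    """Average VE time series across epochs, sample by sample."""
    return [statistics.fmean(samples) for samples in zip(*ve_epochs)]


def pooled_t(a, b):
    """Independent-samples t statistic (pooled variance) and its df."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical toy data: 2 sensors, 3 samples per epoch, 3 epochs per condition.
weights = [0.5, -0.5]
dichotic = [
    [[0.0, 2.0, 0.0], [0.0, -2.0, 0.0]],
    [[0.0, 3.0, 0.0], [0.0, -1.0, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, -2.0, 0.0]],
]
diotic = [
    [[0.0, 0.5, 0.0], [0.0, -0.5, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, -0.5, 0.0]],
]

ve_dichotic = [virtual_electrode(weights, epoch) for epoch in dichotic]
ve_diotic = [virtual_electrode(weights, epoch) for epoch in diotic]

# Spatially filtered evoked response: the average VE trace across epochs.
evoked = average_epochs(ve_dichotic)
peak_index = max(range(len(evoked)), key=lambda i: abs(evoked[i]))

# Compare single-epoch amplitudes at the peak latency between conditions.
t_stat, dof = pooled_t(
    [ve[peak_index] for ve in ve_dichotic],
    [ve[peak_index] for ve in ve_diotic],
)
print(f"t({dof}) = {t_stat:.2f}")
```

Because the weights are applied to the unfiltered sensor data, the same projection can be reused for any time window or frequency range, which is why activity outside the 5–25 Hz, 200-ms bin appears in the VE trace with lower spatial accuracy.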

Pitch-Matching Paradigm

In this self-paced behavioral paradigm, participants adjusted the frequency of a pure tone until its pitch matched that elicited by the dichotic modulation. Stimuli were presented through Sennheiser HD-50 earphones, wired in phase, and adjustments were made with a computer-game joystick. Matches were made separately for each dichotic stimulus, at a given noise bandwidth, center frequency, and modulation rate. The noise power density was 64 dB SPL. The modulation depth was always set at 0.3, which was above the detection threshold for each participant at the slowest modulation rate, so that any pitch could be comfortably perceived. In each trial, an exemplar of the dichotic-modulated complex was followed, after a 500-ms pause, by a 500-ms pure tone. The frequency of the pure tone was first set to 1 kHz, and the listener used the joystick to adjust its frequency until the pitches of the modulated noise and the alternating pure tone matched. Initial adjustments were made with a minimum step of 10 Hz and a maximum step, at maximum displacement of the joystick, of about 90 Hz. Listeners were then able to reduce the adjustment scale by a factor of 10. This experimental protocol was based on similar work by Roberts and Brunstrom (2001). The listeners usually achieved a match within 20–30 stimulus-pair presentations, and the consistency of matches was similar to that observed for other binaural pitches (Klein and Hartmann 1981; Hartmann and McMillon 2001).

Results

MEG Recordings

Our measurements were designed to determine whether, as reported by our observers, the percept of pitch was a binaural phenomenon produced by interaural phase changes. We also explored whether the introduction of modulation elicited responses that differed from those arising from sound onset. We expected the response to a true pitch to differ from a simple response to the change in the acoustic power distribution caused by modulation of the noise carrier.

In the dichotic condition, all listeners reported hearing a salient pitch beginning when the dichotic modulation began. In the diotic condition, listeners reported occasionally hearing a faint pitch.

Figure 3 shows the MEG global field power in each hemisphere for all participants as a function of time; this provides a useful summary indication of the time course of any evoked fields. In both the dichotic and the diotic conditions (Fig. 3a,b), the onset of the stimulus evokes 2 clear responses at about 112 and 205 ms; these are the well-described auditory N1m and P2m sound-onset responses. In the dichotic condition (Fig. 3a), there is a third response occurring at around 700 ms, evoked by the introduction of dichotic modulation that began at 500 ms; this response is more variable than the onset responses, both across participants and between hemispheres. Nevertheless, a response to the onset of dichotic modulation is visible for all listeners in either the left or the right hemisphere or both. The introduction of diotic modulation, shown in Figure 3b, did not evoke a consistent response. Table 1 shows amplitude and latency data for the P2m and the response to dichotic modulation for each participant. The P2m provides a useful functional landmark for activity located in primary auditory cortex and has a similar latency to our dichotic-modulation response. While the latency of the response to the dichotic modulation is relatively consistent, there is variability in relative amplitude between hemispheres. This is further illustrated by examination of Spearman rank correlation coefficients (rho) between amplitudes in the left and right hemispheres. For the P2m response, left and right amplitudes are moderately correlated (rho = 0.53; P = 0.15), and this rises to a very strong correlation (rho = 0.9; P = 0.0024) if participant 2, whose right-hemisphere value is an outlier, is excluded from the analysis. But for the response to dichotic modulation, left and right amplitudes are uncorrelated (rho = −0.002; P = 0.99), indicating a pattern of laterality which differs from that for the response to energy onset.
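The effect of a single outlying participant on a rank correlation, as described above for the P2m amplitudes, can be illustrated with a short sketch. The amplitude values below are entirely hypothetical and are not the data in Table 1; the helper names are likewise illustrative. Spearman's rho is computed here as the Pearson correlation of the ranks, with tied values given their average rank.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranked data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical left/right amplitudes for 9 participants (NOT the study data).
# The second participant has an unusually large right-hemisphere value.
left = [10, 12, 14, 16, 18, 20, 22, 24, 26]
right = [11, 60, 13, 19, 17, 23, 21, 27, 25]

rho_all = spearman_rho(left, right)

# Repeat the analysis with the outlying participant (index 1) excluded.
keep = [i for i in range(len(left)) if i != 1]
rho_without = spearman_rho([left[i] for i in keep], [right[i] for i in keep])

print(f"rho (all participants) = {rho_all:.2f}")
print(f"rho (outlier excluded) = {rho_without:.2f}")
```

With only nine participants, one discordant pair of ranks can shift rho substantially, which is why the correlation is reported both with and without participant 2.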

Figure 4 shows field patterns for the response to dichotic modulation and spectrograms of activity in a single sensor from the peak of the field pattern, from 2 representative participants. The field patterns show the bilateral field pattern with reversed polarity across hemispheres typical of an auditory-evoked response. The spectrograms show a burst of evoked power in the theta–beta range following the onset of the stimulus and another burst of activity following the onset of modulation in the dichotic condition. The response to the onset of dichotic modulation is reduced or absent in the diotic modulation condition for all participants. The dichotic response is variable between participants but the majority of power occurs within the theta and alpha bands. Like the evoked response amplitudes, this measure is variable across hemispheres. Peak spectral power values for the dichotic-modulation response in left and right hemispheres are shown in Table 2, column (a). These are moderately but not significantly correlated with the dichotic modulation–evoked response amplitudes from Table 1, column (d) (left hemisphere: rho = 0.52, P = 0.15; right hemisphere: rho = 0.61, P = 0.08) and again there is no significant correlation between spectral power in the left and right hemisphere (rho = 0.24, P = 0.54).

The Dichotic-Modulation Response Is Not an Energy-Onset Response

ER–beamformer source models were computed for the responses identified in Figures 3 and 4: the N1m and P2m onset responses and the response evoked by introduction of dichotic modulation. The N1m and P2m responses yielded bilateral source models in each participant, but the response to dichotic modulation was more variable—8 participants showed a response in the right temporal lobe and 6 in the left temporal lobe. Figure 5 and Table 3 show group data for the sources. In Figure 5, normalized images for each individual have been rescaled to a maximum of 1 and overlaid, to reveal areas of maximum spatial consistency across the group. Figure 5a shows peak activation in the group for the N1m response, Figure 5b shows the P2m response, and Figure 5c shows the response to dichotic modulation. Figure 5d illustrates the relative timing of the activations. The center of the area of maximum overlap for the P2m response is in Heschl's gyrus (bilateral auditory cortex), and the main part of the N1m response is localized lateral and posterior to the P2m response, consistent with the literature (Lutkenhoner and Steinstrater 1998). These sources provide a functional landmark with which the source of the dichotic-modulation response can be contrasted. The response to the onset of dichotic modulation is located posterior to the main sources of both components of the N1m–P2m complex, as can be seen in Figure 5e where all 3 activation patterns are overlaid. The overlay data in Table 3 show the consistency of the activations. When considered in the context of the number of individuals for whom a response was seen in the sensor data, spatial consistency for the pitch response is excellent—82% of activation was coincident in the right hemisphere and 83% in the left hemisphere.
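As a rough worked example of the overlay and the quoted consistency figures: the sketch below rescales each individual image to a maximum of 1, sums the images voxel by voxel, and expresses a coincidence score as a percentage of the number of participants who showed a response. The scores 6.60 and 4.99 are the spatial-coincidence values tabulated for the dichotic-modulation response, and 8 and 6 are the responder counts given above. Treating the percentage as score divided by responder count is an assumed reading of how the figures were derived, and the function names and the tiny two-voxel images are hypothetical.

```python
def overlay(images):
    """Rescale each image to a maximum of 1, then sum voxel by voxel.

    images: list of flat voxel lists, one per participant (assumed to have
    a positive maximum). Peaks that nearly coincide contribute fractions.
    """
    rescaled = []
    for image in images:
        peak = max(image)
        rescaled.append([value / peak for value in image])
    return [sum(voxel) for voxel in zip(*rescaled)]


def consistency_percent(score, n_responders):
    """Coincidence score as a percentage of the maximum attainable score."""
    return 100.0 * score / n_responders


# Two hypothetical participants whose peaks fall on neighbouring voxels:
# the summed image peaks at a noninteger value below 2.
print(overlay([[2.0, 4.0], [3.0, 1.5]]))

right = consistency_percent(6.60, 8)  # about 82.5
left = consistency_percent(4.99, 6)   # about 83.2
print(f"right: {right:.1f}%  left: {left:.1f}%")
```

On this reading, the right-hemisphere value of about 82.5% corresponds to the 82% quoted in the text, and the left-hemisphere value of about 83.2% to the quoted 83%.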

These data confirm that the neurophysiological response evoked by the onset of dichotic modulation of the noise is not an energy-onset response and support the view that a new percept emerges from the modulated noise stimulus.

The Dichotic-Modulation Response Is a Binaural Phenomenon

Figure 6 shows averaged VE data for the source of the response to dichotic modulation in planum temporale in 2 individuals. Because this source is adjacent to the peak source of the responses to energy onset, clear N1m/P2m responses can be observed in each plot even though the source is optimized for the dichotic-modulation response. These are followed by a pronounced response to the onset of dichotic modulation. Data for the diotic modulation condition are also plotted, and error bars (±1 standard deviation) shown at the peak of the response to dichotic modulation confirm a large and significant difference between the activity in the dichotic and diotic conditions at this voxel. This difference was tested statistically for each participant by computing a t-test between the amplitudes in the dichotic and diotic conditions at this source and at the latency of the dichotic-modulation response (Table 2, column (b)). The analysis effectively tests whether each individual's response was dependent on binaural processing. There was a significant difference in one or both hemispheres for all except one participant; these data therefore agree with listeners' reports that the percept of pitch is significantly stronger and more salient when the modulation is presented dichotically.

Figure 6.

Source-model VE time series data are shown for 2 example participants. (a,d) Averaged VE time series for source of the response to dichotic modulation in one hemisphere. In the average for the dichotic condition (red line), a large evoked response is observed at this source at approximately 700 ms, 200 ms following the onset of dichotic modulation. In the diotic condition (blue line), the evoked response is absent (d) or small (a). (b,c and e,f) Spectrograms of the same time series in the dichotic (left) and diotic (right) conditions. For each participant, there is some leakage of the response to sound onset into this source, between 0 and about 300 ms in each spectrogram. In the spectrograms for the dichotic condition, a burst of spectral power is observed at around 600 ms, with components in the theta and alpha ranges. These are absent or attenuated in the diotic condition.

The spectrograms in Figure 6 illustrate the pattern of spectral power change at the source of the dichotic-modulation response. Again there is some leakage of the energy-onset response into this voxel (at around 100 ms, consistent with the adjacently located N1m), but a clear burst of spectral power can be seen following the onset of dichotic modulation. This is the burst of power which was localized in the ER-beamformer source model for these voxels. In the diotic modulation condition at this source, this burst of power was reduced or absent.

The Dichotic-Modulation Response Is Lateralized

Table 2 also highlights the individual differences in the perceptual laterality of the pitch, which was closely related to the spectral power of the evoked response in both sensor and source-model data. Column (a) shows the peak spectral power between 600 and 800 ms in a sensor from the peak of the dipolar field pattern evoked by dichotic modulation, for the left and right hemisphere in each participant (for examples of these spectrograms, see also Fig. 4). Expressed as a percentage of the overall power in the spectrogram, these values are normalized for the effects of left–right differences in dewar placement. There are large individual differences in laterality, which are also consistent with the source-model data in column (b): some individuals show a source in one hemisphere only, whereas for others the source is bilateral. Reported perceptual laterality is shown in column (c). With one exception, listeners who reported the pitch to be bilateral or on the right had bilateral MEG responses. The exception had a left-hemisphere advantage. The listeners who heard the pitch on the left had a right-lateralized response (although one showed no significant effect in column (b)). Thus, perceptual laterality was reflected quite consistently in the MEG data, indexed by spectral power at sensor level, by the presence or absence of a temporal lobe source, and by the significant effects of binaural hearing, all of which were lateralized to the contralateral hemisphere.

Pitch Dependence on Stimulus Characteristics

Further psychophysical testing on 2 trained listeners (authors G.B.H. and C.W.) explored the dependence of the pitch on the spectral characteristics of the stimulus. The listeners used a computer-game joystick to adjust the frequency of a pure tone until its pitch matched the pitch elicited by the dichotic modulation. Combinations of 3 center frequencies (500 Hz, 1 kHz, and 3 kHz) and 3 bandwidths (250 Hz, 125 Hz, and 63 Hz) were used, across the range of modulation rates where the listeners found that they could detect the pitch.

Figure 7 provides an overview of the pure-tone frequencies to which the dichotic-modulation stimuli were matched. The listeners were able to make matches for the 500-Hz center frequency (all bandwidths) and the 1-kHz center frequency (250- and 125-Hz bandwidths only), but for none of the stimuli centered at 3 kHz. The modulation rates at which pitch matches were possible differed according to center frequency: for the 500-Hz stimuli, they ranged from 40 to 180 Hz, whereas for the 1-kHz stimuli, they were between 120 and 360 Hz. Overall, where pitch matches could be made, the frequency of the matched tone decreased systematically as a function of increasing stimulus bandwidth and increasing modulation rate.

Figure 7.

Mean and standard deviation of pitch matches for two listeners indicated as filled and clear symbols. Data are shown for two carrier frequencies and three different bandwidths: 500-Hz carrier: 250-Hz bandwidth (•), 125-Hz bandwidth (▪), 63-Hz bandwidth (▴). 1-kHz carrier: 250-Hz bandwidth (♦), 125-Hz bandwidth (▸) and 63-Hz bandwidth (◂).

Figure 8 shows the pitch as a function of modulation frequency for a 250-Hz bandwidth noise, replotted from Figure 7. The dashed line shows the frequency of the lower boundary of the lower sideband of the modulated noise. Pitch matches lie approximately 5% above this boundary.
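The relation between the pitch match and the sideband boundary can be sketched numerically. The sketch assumes that the lower boundary of the lower sideband lies one modulation frequency below the lower edge of the noise band, that is, at center frequency minus half the bandwidth minus the modulation rate. That formula is an assumption made for illustration (the stimulus definition is not given in this section), and the function names are hypothetical. The 5% offset is the approximate value reported above.

```python
def lower_sideband_edge(center_hz, bandwidth_hz, mod_rate_hz):
    """Assumed lower boundary of the lower sideband, in Hz.

    Lower edge of the noise band (center - bandwidth / 2), shifted down
    by one modulation frequency. This is an illustrative assumption.
    """
    return center_hz - bandwidth_hz / 2 - mod_rate_hz


def approx_pitch_match(center_hz, bandwidth_hz, mod_rate_hz, offset=0.05):
    """Pitch match placed a fixed fraction (about 5%) above that boundary."""
    return (1 + offset) * lower_sideband_edge(center_hz, bandwidth_hz, mod_rate_hz)


# 250-Hz-wide noise centered on 500 Hz, at the lowest and highest
# modulation rates for which matches were made at this center frequency.
for rate in (40, 180):
    edge = lower_sideband_edge(500, 250, rate)
    match = approx_pitch_match(500, 250, rate)
    print(f"fm = {rate} Hz: boundary = {edge:.0f} Hz, match near {match:.0f} Hz")
```

Under this assumption the predicted match falls as either the modulation rate or the bandwidth increases, which is the direction of the trend described for Figure 7.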

Figure 8.

This figure, replotted from Figure 7, shows pitch matches of the 2 listeners, indicated as filled and clear symbols, for a 250-Hz bandwidth noise centered on 500 Hz at different modulation rates. The dashed line shows the frequency of the lower boundary of the lower spectral sideband produced by the phase modulation. Pitch matches are about 5% above the frequency of the boundary.

Participants reported that the pitch was most salient in the stimuli centered on 500 Hz and that pitch matching with the sounds centered on 1 kHz was more challenging. This observation was reflected in the consistency of responses, as indicated by the error bars in Figure 7 and the octave change in matching frequency for some modulation rates at 1 kHz. In addition, at 1 kHz, the existence region of the pitch differed between listeners in the range of bandwidths and modulation rates.

Pitch matches (not shown) were also made for the very faint pitch that was heard in the diotic condition with the 250-Hz bandwidth carrier centered on 500 Hz. The frequency of the pure-tone match was the same as for the dichotic stimulus, indicating that the pitch was the same in both conditions and that binaural cues simply increase saliency.

Discussion

This is the first report of binaural pitch being elicited by the binaural phase modulation of a noise stimulus. The occurrence of the pitch appears to account for previously unexplained high sensitivity to rapid IPD fluctuations (Grantham and Wightman 1978; Witton et al. 2005). Our data also provide the first MEG localization of binaural pitch-sensitive areas to planum temporale and provide physiological evidence for the perceptual laterality of binaural pitches.

In Grantham and Wightman's original studies using this stimulus, it is possible that listeners discriminated dichotic from diotic phase modulations by detecting the pitch associated only with the dichotic modulation. In the original study, there was no report of a dichotic pitch, but participants were listening for just-noticeable differences between diotic and dichotic modulation, so modulation depths were small and the pitch would have been subtle; the task set might not have elicited comments about the cue being used. However, the data presented here provide strong evidence that pitch can be used as a cue and that the mechanisms underpinning binaural pitch perception account for the listeners' surprisingly “nonsluggish” sensitivity to rapid interaural phase fluctuations (Witton et al. 2005).

Our MEG data confirmed neurophysiologically that the onset of the dynamic IPDs is processed in a different cortical location, and with different spectrotemporal response properties, from the onset of the noise itself. IPDs are extracted from stimuli at the brainstem level, but it is thought that cortical networks in the planum temporale encode auditory “objects” (Griffiths and Warren 2004), which may include the emergence of pitch from background noise (Chait et al. 2006; Hall and Plack 2009). Our spatial localization data are strikingly consistent with previous functional MRI (fMRI) reports of such pitch-sensitive areas (Hall and Plack 2009), providing the first MEG confirmation of this result for a binaural pitch. Hall and Plack (2009) found similar activation patterns for Huggins pitch as for other pitches such as tones in noise, the only exception being iterated rippled noise, which activated lateral Heschl's gyrus, consistent with MEG data from Krumbholz et al. (2003). Other MEG studies have demonstrated that the auditory N1m and P2m are sensitive to the pitch of the stimulus that elicits them, showing a tonotopic organization (e.g., Pantev et al. 1989; Roberts and Poeppel 1996). However, the stimulus design used here separates the onset of the pitch from the onset of sound energy, and our data show that a strong component of pitch processing occurs in a cortical location that is posterior to the energy-onset responses. Our MEG time series data agree well with previous work in that the evoked magnetic field at the cortical source has similar latency characteristics to binaural pitch-onset responses reported by Hertrich et al. (2005) and Chait et al. (2006), who each contrasted neuromagnetic responses to Huggins pitch with responses to “nonbinaural” pitches.

An interesting characteristic of this, and other, binaural pitches is its perceptual laterality, which is reported despite the stimulus being binaurally symmetrical. Our MEG data confirm that the perceptual laterality is consistently reflected in a neurophysiological lateralization. Specifically, laterality was reflected in the spectral power change evoked by the onset of pitch rather than in the evoked response amplitude, possibly implying a functional role for oscillations in the theta to alpha range. Huggins pitch is also perceptually lateralized, but previous neuroimaging studies have not explored whether the laterality is shown in individual auditory cortical activations. It has been suggested that the laterality of binaural pitches results from a more generalized perceptual asymmetry in the auditory system (Zhang and Hartmann 2008), but models of binaural interaction do not provide an explanation for this phenomenon. Furthermore, as in this study with MEG, lateralized binaural stimuli do not always elicit contralateral fMRI activations, with response laterality depending on the cortical mechanisms underlying the binaural lateralization process itself (von Kriegstein et al. 2008). It is possible that the rapidly changing IPDs in our stimulus are extracted in brainstem nuclei (Joris and Yin 2007), but as this and other binaural pitch stimuli are symmetrical, the perceptual laterality may emerge at a higher level.

Our psychophysical data show how the pitch changes as a function of mean carrier frequency, carrier bandwidth, and modulation rate. Models such as the modified equalization–cancellation (m-EC) model (Culling and Summerfield 1995) successfully describe how other binaural pitches occur at interaural phase edges within bands of noise (Culling, Summerfield, and Marshall 1998). But the notion that the pitch described here is a simple edge pitch is problematic: it occurs near a lower spectral boundary, not at a phase edge within the noise. Spectral boundaries such as this do not typically produce a binaural pitch. Ultimately, the interaural phase properties of this stimulus are complex, depending not only on the modulation waveform but also on how the overlapping, random noise and sideband components sum and cancel under different stimulus conditions.

Psychophysical and neurophysiological paradigms based on binaural pitches have provided many useful insights into how the brain integrates information from both ears. Models of binaural hearing are, in the end, key to our understanding of how the typically developed and impaired auditory system extracts information from the auditory world. As well as providing new data, the novel binaural pitch illustrates a steady perceptual phenomenon that emerges from dynamic interaural phase variations.

Funding

MEG recordings in the Wellcome Trust Laboratory for MEG Studies were funded by the Dr Hadwen Trust. Structural MRI scans were supported by the Lord Dowding Fund for Humane Research. G.B.H. was supported by the Leverhulme Trust.

Conflict of Interest: None declared.

References

Adjamian P, Barnes GR, Hillebrand A, Holliday IE, Singh KD, Furlong PL, Harrington E, Barclay CW, Route PJ. Co-registration of magnetoencephalography with magnetic resonance imaging using bite-bar-based fiducials and surface-matching. Clin Neurophysiol, 2004, vol. 115 (pg. 691-698).
Adjamian P, Worthen SF, Hillebrand A, Furlong PL, Chizh BA, Hobson AR, Aziz Q, Barnes GR. Effective electromagnetic noise cancellation with beamformers and synthetic gradiometry in shielded and partly shielded environments. J Neurosci Methods, 2009, vol. 178 (pg. 120-127).
Baillet S, Mosher JC, Leahy RM. Electromagnetic brain mapping. IEEE Signal Process Mag, 2001, vol. 18 (pg. 14-30).
Brookes MJ, Stevenson CM, Barnes GR, Hillebrand A, Simpson MI, Francis ST, Morris PG. Beamformer reconstruction of correlated sources using a modified source model. Neuroimage, 2007, vol. 34 (pg. 1454-1465).
Chait M, Pöppel D, Simon JZ. Neural response correlates of detection of monaurally and binaurally created pitches in humans. Cereb Cortex, 2006, vol. 16 (pg. 835-848).
Cheyne D, Bakhtazad L, Gaetz W. Spatiotemporal mapping of cortical activity accompanying voluntary movements using an event-related beamforming approach. Hum Brain Mapp, 2006, vol. 27 (pg. 213-229).
Cheyne D, Bostan AC, Gaetz W, Pang EW. Event-related beamforming: a robust method for presurgical functional mapping using MEG. Clin Neurophysiol, 2007, vol. 118 (pg. 1691-1704).
Cramer EM, Huggins WH. Creation of pitch through binaural interaction. J Acoust Soc Am, 1958, vol. 30 (pg. 413-417).
Culling JF, Summerfield AQ. Perceptual separation of concurrent speech sounds—absence of across-frequency grouping by common interaural delay. J Acoust Soc Am, 1995, vol. 98 (pg. 785-797).
Culling JF, Summerfield AQ, Marshall DH. Dichotic pitches as illusions of binaural unmasking. I. Huggins' pitch and the “binaural edge pitch”. J Acoust Soc Am, 1998, vol. 103 (pg. 3509-3526).
Goldman S. Frequency analysis, modulation and noise. New York: McGraw-Hill, 1948.
Grantham DW, Wightman FL. Detectability of varying inter-aural temporal differences. J Acoust Soc Am, 1978, vol. 63 (pg. 511-523).
Griffiths TD, Warren JD. What is an auditory object? Nat Rev Neurosci, 2004, vol. 5 (pg. 887-892).
Hadjipapas A, Hillebrand A, Holliday IE, Singh KD, Barnes GR. Assessing interactions of linear and nonlinear neuronal sources using MEG beamformers: a proof of concept. Clin Neurophysiol, 2005, vol. 116 (pg. 1300-1313).
Hall DA, Plack CJ. Pitch processing sites in the human auditory brain. Cereb Cortex, 2009, vol. 19 (pg. 576-585).
Hartmann WM, McMillon CD. Binaural coherence edge pitch. J Acoust Soc Am, 2001, vol. 109 (pg. 294-305).
Herdman AT, Wollbrink A, Chau W, Ishii R, Ross B, Pantev C. Determination of activation areas in the human auditory cortex by means of synthetic aperture magnetometry. Neuroimage, 2003, vol. 20 (pg. 995-1005).
Hertrich I, Mathiak K, Menning H, Lutzenberger W, Ackermann H. MEG responses to rippled noise and Huggins pitch reveal similar cortical representations. Neuroreport, 2005, vol. 16 (pg. 193-196).
Hillebrand A, Barnes GR. Beamformer analysis of MEG data. Int Rev Neurobiol, 2005, vol. 68 (pg. 149-171).
Hillebrand A, Singh KD, Holliday IE, Furlong PL, Barnes GR. A new approach to neuroimaging with magnetoencephalography. Hum Brain Mapp, 2005, vol. 25 (pg. 199-211).
Huang MX, Mosher JC, Leahy RM. A sensor-weighted overlapping-sphere head model and exhaustive head model comparison for MEG. Phys Med Biol, 1999, vol. 44 (pg. 423-440).
Joris P, Yin TCT. A matter of time: internal delays in binaural processing. Trends Neurosci, 2007, vol. 30 (pg. 70-78).
Klein MA, Hartmann WM. Binaural edge pitch. J Acoust Soc Am, 1981, vol. 70 (pg. 51-61).
Krumbholz K, Patterson RD, Seither-Preisler A, Lammertmann C, Lutkenhoner B. Neuromagnetic evidence for a pitch processing center in Heschl's gyrus. Cereb Cortex, 2003, vol. 13 (pg. 765-772).
Lehmann D, Skrandies W. Reference-free identification of components of checkerboard-evoked multichannel potential fields. Electroencephalogr Clin Neurophysiol, 1980, vol. 48 (pg. 609-621).
Lutkenhoner B, Steinstrater O. High-precision neuromagnetic study of the functional organization of the human auditory cortex. Audiol Neurootol, 1998, vol. 3 (pg. 191-213).
Naatanen R, Picton T. The N1 wave of the human electric and magnetic response to sound—a review and an analysis of the component structure. Psychophysiology, 1987, vol. 24 (pg. 375-425).
Pantev C, Hoke M, Lutkenhoner B, Lehnertz K. Tonotopic organization of the auditory cortex: pitch versus frequency representation. Science, 1989, vol. 246 (pg. 486-488).
Popescu M, Popescu E-A, Chan C, Blunt SD, Lewine JD. Spatio-temporal reconstruction of bilateral auditory steady-state responses using MEG beamformers. IEEE Trans Biomed Eng, 2008, vol. 55 (pg. 1092-1102).
Robinson SE, Vrba J. Functional neuroimaging by synthetic aperture magnetometry (SAM). In: Yoshimoto T, editor. Recent advances in biomagnetism. Sendai (Japan): Tohoku University Press, 1999 (pg. 302-305).
Roberts B, Brunstrom JM. Perceptual fusion and fragmentation of complex tones made inharmonic by applying different degrees of frequency shift and spectral stretch. J Acoust Soc Am, 2001, vol. 110 (pg. 2479-2490).
Roberts TPL, Poeppel D. Latency of auditory evoked M100 as a function of tone frequency. Neuroreport, 1996, vol. 7 (pg. 1138-1140).
Stockwell RG, Mansinha L, Lowe RP. Localization of the complex spectrum: the S-Transform. IEEE Trans Signal Process, 1996, vol. 44 (pg. 998-1001).
von Kriegstein K, Griffiths TD, Thompson SK, McAlpine D. Responses to interaural time delay in human cortex. J Neurophysiol, 2008, vol. 100 (pg. 2712-2718).
Witton C, Green GGR, Henning GB. Binaural “sluggishness” as a function of stimulus bandwidth. In: Pressnitzer D, de Cheveigne A, McAdams S, Collet L, editors. Auditory signal processing: physiology, psychophysics, and models. Berlin (Germany): Springer, 2005 (pg. 443-453).
Zhang PX, Hartmann WM. Lateralization of Huggins pitch. J Acoust Soc Am, 2008, vol. 124 (pg. 3873-3887).