Cortical synchronization at γ-frequencies (35–90 Hz) has been proposed to define the connectedness among the local parts of a perceived visual object. This hypothesis is still under debate. We tested it under conditions of binocular rivalry (BR), where a monkey perceived alternations among conflicting gratings presented singly to each eye at orthogonal orientations. We made multi-channel microelectrode recordings of multi-unit activity (MUA) and local field potentials (LFP) from striate cortex (V1) during BR while the monkey indicated his perception by pushing a lever. We analyzed spectral power and coherence of MUA and LFP over 4–90 Hz. As in previous work, coherence of γ-signals in most pairs of recording locations strongly depended on grating orientation when stimuli were presented congruently in both eyes. With incongruent (rivalrous) stimulation LFP power was often consistently modulated in consonance with the perceptual state. This was not visible in MUA. These perception-related modulations of LFP occurred at low and medium frequencies (<30 Hz), but not at γ-frequencies. Perception-related modulations of LFP coherence were also restricted to the low–medium range. In conclusion, our results do not support the expectation that γ-synchronization in V1 is related to the perceptual state during BR, but instead suggest a perception-related role of synchrony at low and medium frequencies.
Synchronization at γ-frequencies (35–90 Hz) can occur in the visual cortex among neurons activated by a visual object. This synchrony has been assumed to label the distributed features of the object as belonging together perceptually. However, this hypothesis is still under debate. Whether perception of a visual object is really paralleled by γ-synchrony can be tested in tasks where the visual stimulus stays identical while the percept alternates between two possibilities. This is the case in binocular rivalry (BR), the phenomenon of alternating perception of two incongruent visual stimulus patterns presented dichoptically, one to each eye.
There are ongoing controversies about the nature of BR: whether it is caused by interocular competition (Lehky, 1988; Blake, 1989; Lee and Blake, 1999) or by competition of coherent percepts [Diaz-Caneja (1928), translated in Alais et al. (2000) and replicated in Ngo et al. (2000); Kovács et al., 1996; Logothetis et al., 1996]. At the neurophysiological level, these alternatives read as competition between monocular channels at early stages on the one hand (Blake, 1989), or competition of high-level cortical representations on the other hand (Logothetis, 1998). The fact that the percentage of single cells that correlate with the monkey’s perception during BR increases along the ventral pathway is taken as evidence for the high-level explanation (Leopold and Logothetis, 1996; Sheinberg and Logothetis, 1997; Logothetis, 1998). Modulation of monocular channels in primary visual cortex due to alternating perception has been reported with fMRI in humans (Tong and Engel, 2001), supporting the low-level model. Corresponding results are also discussed in the context of visual awareness and its neuronal correlates. The role of primary visual cortex in containing signals correlated with visual awareness is especially controversial (Logothetis, 1998; Engel et al., 1999; Andrews, 2001; Tong and Engel, 2001). More recently, the view has gained acceptance that computational resources along the whole visual pathway are involved in resolving perceptual ambiguities, BR being only one among others, including bistable figures of any kind (Blake and Logothetis, 2002). Not only the locus but also the quality of a potential awareness-relevant neuronal signal remains unclear.
For primary visual cortex alone, argumentation is based on such different measures as mean spike rate of single cells in awake monkey (Leopold and Logothetis, 1996), multi-unit spike correlation and spike-to-field-potential coherence in strabismic cat (Fries et al., 1997, 2002) and BOLD signal in fMRI of the human blind spot region (Tong and Engel, 2001).
Synchronized γ-oscillations observed in the primary visual area of cat (Eckhorn et al., 1988; Gray et al., 1989) and monkey (Kreiter and Singer, 1992; Eckhorn et al., 1993a) have often been associated with feature binding (reviewed in Eckhorn, 1999; Engel et al., 1999; Gray, 1999) and therefore should also play a role in perceptual segregation and integration of those features belonging to the currently perceived object. In the context of BR the synchronization hypothesis has so far only been investigated intra-cranially in preliminary studies in awake monkeys in our group (Kottmann et al., 1996) and in strabismic cats (Fries et al., 1997), both suggesting a key role of γ-synchronization in awareness-relevant signaling. We tested this hypothesis under conditions of BR, where a monkey perceived alternations among conflicting gratings presented dichoptically at orthogonal orientations. The two alternative percepts occur despite identical physical stimulation. This allows modulations of neuronal activity due to the perceptual state (e.g. synchronization in the γ-range or at other frequencies) to be dissociated from those due to changing stimulation. We performed multi-channel microelectrode recordings from monkey primary visual cortex (V1) during BR and analyzed multi-unit activity and local field potentials over a broad frequency range to determine the role of synchronization in V1 for visual awareness during BR. [An abstract of this work has been published recently (Gail et al., 2001).]
Materials and Methods
Visual Stimulation Setup
Dichoptical visual stimulation was realized with a three-screen setup suitable for humans and monkeys (Fig. 1A). The three screens were synchronized and made congruent in view by means of two semi-transparent mirrors. A two step calibration procedure ensured positional alignment of the screens with an accuracy of ∼0.025° when using the setup with monkeys (Gail et al., 2003). The three screens were adjusted in luminance across the entire grayscale range.
To induce BR we used stationary, soft-edge, high-contrast, sinusoidal luminance grating patches of horizontal and vertical orientation, respectively (Fig. 1). They were dichoptically presented at corresponding retinal positions of the left and right eye while the monkey had to keep fixation within ±0.45° at a small spot on the central screen. In each recording session the patches were adjusted in size and position to cover the classical receptive fields of all recording locations within the patches’ area of full contrast. At the same time patches were made as small as possible to maximize the probability of exclusive dominance of one of the patches across its entire extension (O’Shea et al., 1997). Figure 1C shows the typical arrangement of stimulus and classical receptive fields.
To control the monkey’s behavior we used a high proportion of catch trials and continuously compared the psychometric performance with that of human observers recorded in the same setup. We used trials composed of different stimuli and with different time courses, as described in the following.
Three different types of trials occurred in pseudo-random order: During incongruent trials (∼75%), orthogonal grating patches with variable contrast (see below) were presented to the left and right eye, reliably evoking BR. Congruent trials (∼15%) consisted of identical, full-contrast patches presented to both eyes; they always become perceptually fused to a uniform grating patch. Piecemeal trials (∼10%) consisted of two different patches presented to the left and right eye; each patch was a mixture of horizontal and vertical grating components (Fig. 1D). The left and right patches were not complementary and, therefore, in no way could be fused to the percept of a grating with uniform orientation. This condition always induced a piecemeal percept. The three trial types were not explicitly indicated to the monkey as being different, i.e. the monkey did not know about the current type of trial. In any case the monkey had to report whether he perceived a horizontal (lever up), a vertical (down) or a piecemeal patch (lever release). The incongruent trials were used to induce BR and served as test condition; the congruent and piecemeal trials were used as catch trials and served as reference conditions.
For the incongruent trials we pseudo-randomly varied the contrast difference between the horizontal and the vertical stimulus from trial to trial. Relative contrast ranged from –100% (vertical grating only) to +100% (horizontal grating only). This range was covered in discrete steps (seven for monkeys, nine for humans) symmetrically around the condition of equal physical contrast, which was also the condition of perceptually balanced contrast for both monkeys (equal probability of reporting). Each contrast difference was presented with equal probability. Thus, the condition with balanced contrast comprised only 1/7 = 14% of all incongruent trials for the monkeys. Only trials with balanced contrast were used for evaluation of perception-related modulations (see below). Horizontal and vertical stimuli were pseudo-randomly presented to either eye.
The three trial types differed in their time courses and reward schemes (Fig. 2). We trained the monkeys to report their first percept of one of the two rivaling stimuli as soon as it was exclusively dominant. When rivaling stimuli are switched on, exclusive perceptual dominance is not immediately established, but instead the two incongruent stimuli appear somewhat fused, i.e. ‘piecemeal’-like (Wolfe, 1983). On the other hand, when presented continuously, rivaling stimuli evoke perceptual switching with temporal characteristics depending on stimulus properties (e.g. O’Shea et al., 1994). Hence, trials had to be long enough to allow one of the two stimuli to become perceptually dominant, but short enough to avoid perceptual switching towards the other stimulus. Furthermore, we encouraged the monkeys to report persistent non-appearance of exclusive dominance by means of the third response alternative (‘piecemeal’ report), in order to avoid reports of a particular grating orientation when the percept was actually mixed. Since perception in the rivaling incongruent condition was expected to be always piecemeal-like initially, the piecemeal reports were only accepted after a 1200 ms delay. This limit was chosen based on psychophysical data of humans and monkeys under the given stimulation conditions in our setup. We inserted an initial interval of piecemeal stimulation in the congruent trials (Fig. 2B) in order to make these catch trials look more similar to the incongruent trials. The monkey was thereby taught not to report piecemeal immediately, but to wait and see whether his percept would take on a unique orientation or not.
Correct performance of the monkey was verified in two ways. Firstly, in all non-ambiguous trials with predictable perception [congruent and piecemeal catch trials (∼25%), plus incongruent trials with ‘monocular’ stimulation, i.e. a contrast difference of 100% (2/7 of 75% ≈ 20%)] only the corresponding response alternative was rewarded. Taken together, ∼45% of the trials were non-ambiguous and therefore suitable for directly supervising the monkey’s behavior. Secondly, the psychometric data obtained from all incongruent trials served as a day-by-day probabilistic behavioral control during training and recording. The data had to fit the respective human data collected with the same setup. We compared human and monkey data with respect to reaction times, probabilities of piecemeal percepts and probabilities of perceptual dominance of one of the two stimuli depending on the relative contrast difference between the two stimuli.
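The trial bookkeeping described above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses the approximate proportions stated in the text; the variable names are illustrative only.

```python
from fractions import Fraction

# Approximate trial-type proportions as stated in the text.
incongruent = Fraction(75, 100)
congruent = Fraction(15, 100)
piecemeal = Fraction(10, 100)

# Incongruent trials use seven equally likely contrast steps (monkeys).
n_steps = 7
monocular = incongruent * Fraction(2, n_steps)  # the two 100% contrast-difference steps
balanced = incongruent * Fraction(1, n_steps)   # the single equal-contrast step

catch = congruent + piecemeal
non_ambiguous = catch + monocular

print(f"catch trials:            {float(catch):.1%}")
print(f"'monocular' incongruent: {float(monocular):.1%}")
print(f"non-ambiguous in total:  {float(non_ambiguous):.1%}")
print(f"balanced-contrast:       {float(balanced):.1%}")
```

With these inputs the non-ambiguous share is 13/28 of all trials (about 46%, which the text rounds to ∼45%), and balanced-contrast trials make up 3/28 (about 11%) of all trials.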
Two male rhesus monkeys (Macaca mulatta) aged 9 (monkey H) and 14 years (monkey S) participated in the experiments. All procedures were carried out in accordance with German laws of animal maintenance and experimentation and the guidelines published in the NIH Guide for the Care and Use of Laboratory Animals (NIH Publication No. 86-23, revised 1987). After intensive training and shortly before the experimental sessions a plastic chamber (10 mm o.d.) was implanted under deep barbiturate anesthesia to give access to visual area V1 through the intact dura. In both monkeys the chamber was implanted close to the lunate sulcus to obtain parafoveal receptive fields. Three stainless steel head posts ensured painless head fixation during recording sessions. Head posts had been implanted years before.
In each session up to 16 quartz-isolated, platinum-tungsten fiber-microelectrodes (≥1 MΩ at 1 kHz) were individually advanced into the cortex under acoustical and optical control of the recorded signals (Eckhorn and Thomas, 1993). After detecting the first reliable spikes they were slowly driven another 150–250 µm. Daily cleaning of the dura, before and more intensively after each session, reduced thickening of the dura (and hence dimpling during electrode insertion) to a minimum, ensuring recordings from layers 2 and 3. The electrodes were arranged in a regular 4 × 4 array (750 µm pitch). Two signals were derived from each raw broad-band signal (1–10 kHz). First, multi-unit activity (MUA) was separated by band-pass filtering (1–10 kHz; 18 dB/oct), full-wave rectification and subsequent low-pass filtering (140 Hz; 18 dB/oct), yielding an amplitude-weighted measure of population spike density near the electrode tip without rejecting low-amplitude spikes. The mean MUA amplitude during prestimulus recording (blank gray screen with fixation spot) was subtracted from the following response epochs. Second, local field potentials (LFP) were obtained by band-pass filtering (1–120 Hz; 18 dB/oct). Both analog signals (MUA, LFP) were sampled at a rate of 500 Hz.
Sliding windows (128 ms epoch length; 32 ms shifts) were used to calculate time-resolved spectral power at single recording sites and coherence between signals at pairs of sites. For direct comparability, the same windows were used for the single-channel analyses of MUA amplitude. Spectra were calculated via fast Fourier transform after applying a Hamming window to the mean-free signal of each epoch. Spectra were averaged across trials with identical conditions. Paired coherence between the signals of two different electrodes n and m was calculated using Bartlett smoothing across N trials with identical conditions (Glaser and Ruchkin, 1976):
where S is the complex Fourier spectrum of the signal epoch centered on time t (the asterisk denotes the complex conjugate) and i is the trial number. The expected bias, i.e. the random coherence of the estimate depending on the number of trials, was subtracted (Benignus, 1969):
In short, coherence is a sensitive measure of the linear correlation between two signals, estimated independently at each frequency. A high coherence value at a given frequency requires high co-variation of the spectral amplitudes and a constant phase difference (not necessarily zero) at this frequency across trials.
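A trial-averaged coherence estimate of the kind described above can be sketched as follows. The sketch assumes the common squared-magnitude form, i.e. the squared modulus of the trial-summed cross-spectrum S_n S_m* divided by the product of the trial-summed power spectra; whether this matches the exact expression used here is an assumption, and the bias subtraction is omitted. The synthetic sinusoids and all function names are hypothetical, and only the standard library is used to keep the example self-contained.

```python
import cmath
import math
import random

FS_HZ = 500      # sampling rate stated in the text
N_SAMPLES = 64   # one 128 ms epoch at 500 Hz


def spectrum(epoch):
    """One-sided complex DFT of a mean-free, Hamming-windowed epoch."""
    m = len(epoch)
    mean = sum(epoch) / m
    window = [0.54 - 0.46 * math.cos(2 * math.pi * k / (m - 1)) for k in range(m)]
    x = [(v - mean) * w for v, w in zip(epoch, window)]
    return [
        sum(xk * cmath.exp(-2j * math.pi * f * k / m) for k, xk in enumerate(x))
        for f in range(m // 2 + 1)
    ]


def squared_coherence(trials_n, trials_m):
    """Squared coherence per frequency bin, summed (Bartlett) across trials."""
    spec_n = [spectrum(e) for e in trials_n]
    spec_m = [spectrum(e) for e in trials_m]
    result = []
    for f in range(len(spec_n[0])):
        cross = sum(a[f] * b[f].conjugate() for a, b in zip(spec_n, spec_m))
        power_n = sum(abs(a[f]) ** 2 for a in spec_n)
        power_m = sum(abs(b[f]) ** 2 for b in spec_m)
        ok = power_n > 0 and power_m > 0
        result.append(abs(cross) ** 2 / (power_n * power_m) if ok else 0.0)
    return result


def make_trials(n_trials, freq_hz, phase_locked, rng):
    """Hypothetical test data: sinusoids at two sites, random phase per trial."""
    site_n, site_m = [], []
    for _ in range(n_trials):
        phi_n = rng.uniform(0, 2 * math.pi)
        # Constant 1 rad lag if phase-locked, otherwise an independent phase.
        phi_m = phi_n + 1.0 if phase_locked else rng.uniform(0, 2 * math.pi)
        t = [k / FS_HZ for k in range(N_SAMPLES)]
        site_n.append([math.sin(2 * math.pi * freq_hz * tk + phi_n) for tk in t])
        site_m.append([math.sin(2 * math.pi * freq_hz * tk + phi_m) for tk in t])
    return site_n, site_m


rng = random.Random(0)
locked = squared_coherence(*make_trials(50, 31.25, True, rng))
unlocked = squared_coherence(*make_trials(50, 31.25, False, rng))
print(round(locked[4], 3), round(unlocked[4], 3))  # bin 4 corresponds to 31.25 Hz
```

The phase-locked pair yields a coherence close to 1 at the signal frequency even though the phase difference is non-zero, whereas the pair with independent phases yields a low value that reflects only the finite number of trials.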
For each measure, signal epochs were aligned either to stimulus onset or to the time of the monkey’s behavioral response before averaging data across trials. We subdivided the frequency range into three sections approximating the classical theta/alpha-, beta- and gamma-ranges: low (4–12 Hz), medium (12–27 Hz) and high (28–90 Hz).
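The window length and sampling rate fix the frequency grid on which these ranges are evaluated. The sketch below works this out, assuming that the spectra are taken at the plain FFT bin centers of each 128 ms epoch without zero-padding; the names are illustrative.

```python
FS_HZ = 500      # sampling rate of MUA and LFP
WINDOW_MS = 128  # epoch length
SHIFT_MS = 32    # window shift

window_samples = FS_HZ * WINDOW_MS // 1000
shift_samples = FS_HZ * SHIFT_MS // 1000
resolution_hz = FS_HZ / window_samples

# Frequency ranges as defined in the text (Hz, inclusive bounds).
RANGES = {"low": (4, 12), "medium": (12, 27), "high": (28, 90)}

# Assumed bin centers: integer multiples of the resolution up to Nyquist.
bin_centers = [k * resolution_hz for k in range(window_samples // 2 + 1)]
bins_per_range = {
    name: [f for f in bin_centers if lo <= f <= hi]
    for name, (lo, hi) in RANGES.items()
}

print(window_samples, shift_samples, resolution_hz)
for name, bins in bins_per_range.items():
    print(name, len(bins), bins)
```

Under this assumption each epoch holds 64 samples, successive windows are shifted by 16 samples, the resolution is 7.8125 Hz, and the low, medium and high ranges contain one, two and eight bins, respectively.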
Differences in spectral power between two conditions were tested with Student’s t-test (α = 0.05). The number of trials defines the sample size N. Differences in spectral coherence between two conditions were tested based on the overlap of their 95% confidence intervals. To calculate confidence intervals, firstly the variance was estimated from the coherence value and the number N of contributing epochs (Glaser and Ruchkin, 1976):
The 95% confidence interval is then given by
where FZT is Fisher’s Z transform, $\mathrm{FZT}(x) = \tfrac{1}{2}\ln\frac{1+x}{1-x} = \operatorname{artanh}(x)$.
Coherences were considered different if their confidence bands overlapped by <30%. This threshold percentage was empirically determined. It corresponds to the percentage overlap of the 95% confidence intervals of two normal distributions when shifted just far enough apart that a t-test becomes significant at the 0.05 level.
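The overlap criterion can be illustrated with a short sketch. Several points are assumptions rather than the procedure used here: the standard deviation in Z-space (`sigma_z`) is passed in as a hypothetical parameter instead of being derived from the variance estimate, the interval is built symmetrically in Z-space and transformed back, and the overlap is expressed relative to the shorter of the two intervals.

```python
import math

Z_CRIT_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def fisher_z(x):
    """Fisher's Z transform, 0.5 * ln((1 + x) / (1 - x))."""
    return math.atanh(x)


def confidence_interval(coherence, sigma_z, z_crit=Z_CRIT_95):
    """Interval built in Z-space and transformed back (sigma_z > 0 is assumed)."""
    z = fisher_z(coherence)
    return math.tanh(z - z_crit * sigma_z), math.tanh(z + z_crit * sigma_z)


def overlap_fraction(ci_a, ci_b):
    """Overlap length relative to the shorter interval (assumed definition)."""
    overlap = min(ci_a[1], ci_b[1]) - max(ci_a[0], ci_b[0])
    shorter = min(ci_a[1] - ci_a[0], ci_b[1] - ci_b[0])
    return max(0.0, overlap) / shorter


def coherences_differ(ci_a, ci_b, max_overlap=0.30):
    """Apply the <30% overlap criterion from the text."""
    return overlap_fraction(ci_a, ci_b) < max_overlap


ci_a = confidence_interval(0.30, sigma_z=0.05)
ci_b = confidence_interval(0.60, sigma_z=0.05)
ci_c = confidence_interval(0.32, sigma_z=0.05)
print(coherences_differ(ci_a, ci_b), coherences_differ(ci_a, ci_c))
```

In this example the intervals around 0.30 and 0.60 do not overlap at all and are therefore classed as different, while the intervals around 0.30 and 0.32 overlap almost completely and are not.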
Receptive Field Characterization
The binocular and left and right eye monocular classical receptive field (CRF) positions were determined simultaneously with a newly developed dichoptical mapping technique (Gail et al., 2003) based on a sparse noise reverse-correlation method (Eckhorn et al., 1993b). Applied with independent stimulus sequences simultaneously on all three screens, it provides position, temporal dynamics and ocular dominance of all CRFs of a session within 100–200 s recording time. Pseudo-online evaluation of these data allowed precise calibration of the dichoptical setup and adjustment of the visual stimulus. CRFs were lying at parafoveal locations and close to or around the vertical meridian in both monkeys. Horizontal eccentricity ranged from 0.4° (contra) to 0.2° (ipsi) in monkey S and from 0.7° (contra) to 0.2° (ipsi) in monkey H. Vertical eccentricity (lower hemifield) was 0.2–0.7° (S) and 0.9–1.8° (H).
where $a_{\mathrm{ipsi/contra}}$ denotes the neural response amplitude to ipsi-/contra-lateral stimulation within the receptive field. The ocular dominance index (ODI) was determined with MUA and LFP signals. Either the MUA- or the LFP-ODI was used, depending on the signal type (MUA or LFP) under consideration. A monocularity index (MI) was defined to quantify deviation from the binocular equilibrium (LeVay and Voigt, 1988):
This index ranges from 0 (equal responses for both eyes) to 1 (response only for one eye).
The main rationale of our study is to decide whether, and if so which, aspects of V1 activity are correlated with perception. The procedure was to find recording sites that are modulated by different stimulus orientations in the congruent condition with respect to a certain measure, e.g. multi-unit spike rate or LFP power. Because such a modulation during congruent stimulation could be due either to the different visual stimuli or to the accompanying different percepts, we compared it to the modulation of the same measure by different percepts in the incongruent condition, where the percept (but not the stimulus) changed from trial to trial. Note that for this analysis only trials with balanced stimulus contrast between left and right were evaluated. If the modulation was present consistently in both conditions, we call it perception-related. If it was present only in the congruent condition, we call it stimulus-related. Significance of the differences between vertical- and horizontal-report trials was tested separately in each condition (congruent and incongruent) and for each frequency bin (7.8 Hz resolution). Within the medium- and high-frequency ranges (see above) the alpha criterion was conservatively corrected for multiple testing (Bonferroni correction) by dividing by the number of frequency bins within the range (low range = 1 bin). We then checked whether there was a significant difference anywhere within the range. For modulations to be considered perception-related, they had to show up in both conditions (congruent and incongruent), in the same direction (e.g. stronger for horizontal stimulus and percept), and at the same frequency (not only in the same frequency range). Modulations (M) for each condition were quantified by an index,
where $R_{\mathrm{pref/nonpref}}$ represents the response to the preferred/non-preferred stimulus orientation (determined in the congruent condition). ‘Response’ here refers either to multi-unit spike rate (MUA amplitude) or MUA/LFP power or coherence in one of the above pre-defined frequency ranges.
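The decision rule for one recording site and one frequency range can be sketched as follows. The function names, the labels and the encoding of the modulation direction as +1/−1 are illustrative assumptions; the per-bin p-values are assumed to come from the t-tests described above, and a site that is significant in both conditions only at different bins is treated here as stimulus-related.

```python
ALPHA = 0.05


def corrected_alpha(n_bins, alpha=ALPHA):
    """Bonferroni correction: divide alpha by the number of bins in the range."""
    return alpha / n_bins


def classify_site(p_con, p_inc, sign_con, sign_inc, alpha=ALPHA):
    """Classify one site in one frequency range.

    p_con, p_inc: per-bin p-values (congruent, incongruent), same bin order.
    sign_con, sign_inc: per-bin direction of the difference, +1 or -1.
    """
    threshold = corrected_alpha(len(p_con), alpha)
    sig_con = [p < threshold for p in p_con]
    sig_inc = [p < threshold for p in p_inc]
    if not any(sig_con):
        return "not modulated"
    # Bins that are significant in both conditions at the same frequency.
    shared = [k for k in range(len(p_con)) if sig_con[k] and sig_inc[k]]
    if any(sign_con[k] == sign_inc[k] for k in shared):
        return "perception-related"
    if shared:
        return "anti-modulated"
    return "stimulus-related"


# Hypothetical medium-range example with two bins (threshold 0.025).
print(classify_site([0.01, 0.2], [0.02, 0.5], [+1, +1], [+1, -1]))
print(classify_site([0.01, 0.2], [0.04, 0.5], [+1, +1], [+1, -1]))
print(classify_site([0.01, 0.2], [0.02, 0.5], [+1, +1], [-1, +1]))
```

The second call shows the effect of the correction: p = 0.04 in the incongruent condition would pass an uncorrected 0.05 criterion but fails the corrected threshold of 0.025, so the site counts as stimulus-related only.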
Results
General performance, i.e. the percentage of valid trials (with appropriate ocular fixation and behavioral response timing, irrespective of the decision), was high in both monkeys (S, 85.6%; H, 90.6%). Four observations indicate that the monkeys understood and correctly performed the task. (i) The monkeys reported correctly in the randomly interspersed catch trials (see Materials and Methods). During non-ambiguous catch trial stimulation (left–right congruence or contrast difference 100%) decisions had to fit the stimulus to be considered correct. The average percentages of correct decisions from all valid non-ambiguous catch trials were 94.2% (S: n = 9 sessions) and 98.9% (H: n = 11). (ii) The monkeys’ psychometric functions were nearly identical to those of humans (Fig. 3A). The probability of perceiving the horizontal stimulus, depending on the contrast difference between the horizontal and the vertical stimulus, follows in both monkeys nearly the same sigmoidal course as the average curve of seven human observers using the identical stimulation setup and stimulus properties. (iii) Response delays in the incongruent conditions were largest around zero contrast difference. This was the case in both monkeys and in humans (Fig. 3C). With incongruent stimulation at balanced contrast, behavioral responses were on average made 167 ms (S) and 98 ms (H) later than with non-ambiguous monocular or congruent stimulation. (iv) The probability of piecemeal reports in the incongruent conditions peaked at zero contrast difference. This is as expected and parallels the larger response delays. Maximum rates of piecemeal reports in the incongruent condition were 5.0% (S) and 11.2% (H), respectively (Fig. 3B). Note that for human observers the average reported piecemeal probability was much higher (up to 50%), but with large inter-individual differences.
We recorded data at 135 (monkey S) and 165 (H) recording sites during 9 (S) and 11 (H) sessions with a sufficient number of trials and reliable psychophysical performance. Only sites without technical artifacts and with clear MUA receptive fields were further analyzed (S, 119; H, 160). This left 734 (S) and 1086 (H) electrode pairs for calculations of inter-electrode coherence. For the selected channels, any individual trial containing an LFP signal artifact, as judged by visual inspection, was rejected.
Congruent versus Incongruent Stimulation
The average spectro-temporal properties of the V1 activities are different between the two monkeys, and different between congruent and incongruent stimulation (Fig. 4). LFPs of monkey H show a prominent stimulus-induced power increase between 40 and 60 Hz in the congruent condition after decay of the stimulus-onset transient. This γ-frequency sidelobe is weaker with incongruent stimulation. Power is also slightly weaker in the low- and medium-frequency range in the incongruent compared to the congruent condition (Fig. 4B, right). In monkey S differences between the two conditions are small and mainly affect the low-frequency range. Note that even though spectral power on average peaks at low frequencies, relative enhancement compared to prestimulus baseline is broad-band, mainly across the medium and high-frequency range during the sustained epoch of the responses (>200 ms). The corresponding MUA spectra look similar but with less pronounced low-frequency power (data not shown). Paired inter-electrode coherence of LFP shows no apparent difference at low and medium frequencies and a weak difference at high frequencies between the two conditions in both monkeys (Fig. 4C). With the inter-electrode spacing used (0.75–3.2 mm) corresponding MUA coherences are mostly too low to be analyzed meaningfully. Differences in spike density (MUA amplitude, Fig. 4A) between the two conditions are restricted to effects resulting from the slightly distinct stimulation protocol, i.e. the inserted stimulus switch in the congruent compared to the incongruent condition (see Materials and Methods and Fig. 2A,B).
Perception-related LFP Power
LFP power showed the most prominent results of all considered measures. Many recording sites showed perception-related modulations in LFP power around the time of the monkey’s decision. Figure 5 gives a typical example of consistent modulations in the congruent and incongruent conditions. The modulations are consistent in the sense that higher power results in both conditions from perception of the same orientation, here horizontal. The difference in spectral amplitude does not affect the entire frequency range equally. In the congruent condition power during the percept of the horizontal grating is higher than during the percept of the vertical grating below 30 Hz, but not above. In the incongruent condition the power increase is confined to the range 10–40 Hz with a maximum around 20 Hz. This example illustrates the differences between frequency ranges. Only the medium-frequency range (12–27 Hz) fulfills the criterion for perception-related modulations at this recording site (see Materials and Methods). The low-frequency range (4–12 Hz) is stimulus-related, but not perception-related. The high-frequency range (28–90 Hz) shows no significant preference in this example.
To show that Figure 5 is not an isolated example, we systematically analyzed population data. In the population data perception-related modulations are more common in the LFP low-frequency range than in other frequency ranges or in MUA. Figure 6 contrasts modulation indices of MUA amplitude (Fig. 6A, upper panels) and LFP low-frequency power (Fig. 6A, lower) for all recording sites of monkey S in the incongruent and congruent condition for different time epochs. This comparison reveals a clear difference between MUA amplitude and LFP low-frequency power with respect to perception-related modulations: while modulation indices for low-frequency LFP show increasing numbers of perception-related channels towards the time of decision, MUA indices are widely scattered without discernible temporal development. At the time of decision the modulation indices in the incongruent (rivalrous) condition are broadly and symmetrically distributed around zero in MUA, while many low-frequency LFP indices are shifted towards the positive bisection line, indicating consistent modulations in both conditions and hence perception-related modulations. Note that at the time of decision there is no case in which LFP low-frequency power is significantly modulated in the incongruent condition in the direction opposite to the modulation in the congruent condition (anti-modulation). For monkey S, LFP γ-frequency power also shows a tendency towards perception-related modulations, but tests based on single-channel data revealed no significant results. In monkey H, by contrast, the γ-frequency power does not show this tendency, while modulations of MUA amplitude and LFP low-frequency power look similar in both monkeys (single-channel data not shown).
For quantitative evaluation of the population data, only those recording sites are taken into account that showed orientation-specific modulation in the congruent condition for the measure under consideration. Population data are quantified in two ways. First, the number of single recording sites significantly and consistently modulated in both conditions is counted (Figures 6 and 7A). Second, for some measures there is a tendency towards consistent modulations in the congruent and incongruent conditions that does not reach significance in the single channels. We therefore calculated, for all N recording sites that were significantly modulated in the congruent condition, the average ratio of the modulation indices in these two conditions (co-modulation ratio: $\mathrm{CMR} = \frac{1}{N}\sum M_{\mathrm{incongruent}}/M_{\mathrm{congruent}}$). A CMR value greater than zero indicates a tendency towards perception-related modulations in the population data (Fig. 7B, see below for details).
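The co-modulation ratio can be illustrated with a minimal sketch. The function name and the example modulation indices are hypothetical; the input lists are assumed to hold one pair of indices per recording site that was significantly modulated in the congruent condition, so that no congruent index is zero.

```python
def co_modulation_ratio(m_incongruent, m_congruent):
    """Mean ratio M_incongruent / M_congruent across recording sites."""
    if len(m_incongruent) != len(m_congruent) or not m_congruent:
        raise ValueError("need one non-empty pair of indices per site")
    ratios = [mi / mc for mi, mc in zip(m_incongruent, m_congruent)]
    return sum(ratios) / len(ratios)


# Hypothetical modulation indices for three sites.
m_congruent = [0.2, 0.4, -0.25]

# Incongruent modulation in the same direction at every site: CMR > 0.
consistent = co_modulation_ratio([0.1, 0.1, -0.05], m_congruent)

# Incongruent modulation scattered symmetrically around zero: CMR near 0.
scattered = co_modulation_ratio([0.1, -0.2, 0.0], m_congruent)

print(round(consistent, 3), round(scattered, 3))
```

Because each ratio is positive whenever the incongruent modulation has the same sign as the congruent one, a population with a consistent tendency yields a positive CMR even if no single site reaches significance, while a symmetric scatter averages out to about zero.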
Dynamics of Perception-related Modulations
The number of perception-related recording sites in LFP low-frequency power increases towards the time of decision (Fig. 6A, lower). At around 128 ms prior to decision the modulation indices in the incongruent condition essentially scatter around zero. Many channels are significantly modulated in the congruent condition at this time. However, at the time of decision the number of perception-related channels reaches 34 in this monkey, while no anti-modulated channels were found. The corresponding data for multi-unit spike rate (MUA amplitude) does not show this temporal development: The numbers of perception-related and anti-modulated channels stay balanced (Fig. 6A, upper). Figure 6B summarizes the temporal development of all used measures quantitatively. The time courses are similar for all measures showing perception-related modulations. The asymmetry in favor of perception-related modulations starts ∼100 ms before the behavioral response of the monkeys.
Comparison of Different Measures
Figure 7 summarizes the results for all measures in consideration by means of the maximal percentage of significantly modulated (consistently and oppositely) channels (Fig. 7A) and the maximal co-modulation ratio (Fig. 7B), respectively. The maximum is taken from the last six evaluation epochs before behavioral decision (Fig. 6B). Both monkeys overall show consistent results.
LFP power (Fig. 7A,B, columns 2–4) reveals a graded effect for the different frequency ranges, with perception-related modulations predominantly found at low frequencies and moderately present at medium frequencies. With respect to significant single-channel data the high-frequency range is not significantly perception-related. In contrast, the CMR expresses a small asymmetry in favor of perception-related modulations, but only in monkey S. Note that in both monkeys there are many recording sites modulated in the high-frequency range in the congruent condition, but not in the incongruent one. Especially in monkey H, which showed a substantial sidelobe in the γ-range in the overall average (Fig. 4B, right), there are many more channels in the high- than in the low- or medium-frequency range that are modulated solely in the congruent condition, i.e. stimulus-related channels according to our definition.
LFP coherence (Fig. 7A,B, columns 5–7) shows results in monkey S similar to those for LFP power in this monkey. Low-frequency coherence in particular is often perception-related. Medium and high frequencies show moderate effects. Low-frequency coherence reveals the only major difference between the monkeys: in monkey H there are many (50/274) anti-modulated cases, pulling the CMR below 0.2. Therefore, for monkey H coherence and power differ in frequency specificity: coherence is moderately perception-related only in the medium-frequency range.
MUA amplitude reveals in both monkeys several instances of perception-related modulations in the sense of the above given definition: P = 17% (13/76) in monkey S, P = 12% (13/105) in monkey H (Fig. 7A, first column). But in contrast to LFP power, here the numbers of anti-modulated recording sites are about the same size: S, 11% (8/76); H, 23% (24/105). Accordingly, the mean ratio of modulations in incongruent versus congruent conditions (CMR) is nearly zero (Fig. 7B, first column).
Spectral decomposition of the MUA signal did not yield any frequency specific clustering of perception-related modulations either (data not shown). MUA spectral coherences for most electrode pairs did not exceed the bias to be expected (cf. Materials and Methods) and hence were not further analyzed.
No Difference in Ocular Dominance of Perception-related Recording Sites
The ocular dominance of those recording sites revealing perception-related modulations does not differ in strength (as measured by the monocularity index MI; see Materials and Methods) from that of the whole sample in either monkey (rank-sum test, P > 0.1). Nor is there a difference between the subset of perception-related sites and those modulated solely in the congruent condition (stimulus-related; P > 0.1).
One might argue that ocularity of the perception-related channels in our study is not the critical measure to judge the relevance of inter-ocular rivalry. Here, perception-related modulations are qualified by means of selectiveness for perceived orientation. A hypothetical monocular channel with arbitrary orientation selectivity, being switched on and off according to the current eye dominance or suppression (but not according to dominance/suppression of the channel’s preferred orientation), would not show any modulation here, since stimulus orientation was randomly interchanged between left and right from trial to trial. Therefore, we did the whole analysis in an alternative way: we used monocular stimulation as the reference condition in this case, i.e. the modulation index for the ‘congruent’ condition was calculated between left and right monocular stimulation, irrespective of the grating orientation. We then sorted all incongruent trials with respect to the side of the perceived stimulus (‘decision for right’ versus ‘decision for left’). This leads to psychometric performance curves in complete analogy to Figure 3, but this time depending on the stimulus contrast between right and left (instead of horizontal and vertical). To check for perception-related modulations depending on the side of the perceptually dominant stimulus, we again only compared trials with equal left/right contrast. This procedure did not yield a population tendency in favor of perception-related ocular selectivity in any of the used measures (data not shown).
The main result of this study is the demonstration of perception-related modulations of LFP power at single recording sites and of coherence among pairs of recording sites in V1 at low and medium signal frequencies. Such modulations occurred with respect to orientation, but not with respect to ocular preference. Against our expectation, perception-related modulations of signal coherence were weak or absent at γ-frequencies.
With BR we took great care in designing the task and conducting the training to ensure that the monkeys’ reports reliably reflected perception. We are convinced that both monkeys experienced BR and correctly reported their percepts for four reasons. (i) The monkeys reported correctly in the catch trials (see Results). The catch trials were designed to mimic perception during BR trials (see Materials and Methods). They comprised ∼25% of all trials. In addition to the catch trials, there were incongruent trials with 100% contrast difference between the left- and right-eye stimuli, which were also perceptually unambiguous. Hence the monkeys’ perception was predictable in ∼45% of the trials, and performance in these trials was 94–99% correct. (ii) The monkeys’ psychometric functions are nearly identical to those of humans tested in the same setup. Both indicated horizontal and vertical with continuously, sigmoidally changing probabilities depending on the contrast difference between the horizontal and vertical stimulus (Fig. 3A). Even at 50/50 contrast, the psychometric functions of the monkeys were smooth although reward was random. As BR is still functional with imbalanced contrast (e.g. Blake, 1977; Leopold and Logothetis, 1996), the smooth transition in the probability of seeing either stimulus can hardly result from any other strategy than honestly reporting the percept. For example, with a horizontal–vertical contrast difference of –0.05 the probability of reporting horizontal was 17% although only reports of vertical were rewarded due to the higher contrast of the vertical stimulus in this condition (Fig. 3A, monkey H). The monkey’s error rate in this case was below 2% (i.e. the probability of reporting horizontal with binocular or monocular unambiguous vertical stimuli).
The 17% probability of horizontal reports (equal to the data of human observers) is best explained by occasional percepts of the horizontal stimulus occurring due to rivalry and being honestly reported. (iii) Reaction times with the incongruent stimuli (zero contrast difference) are longer, both in our measurements with monkeys and in those with humans. When incongruent stimuli are flashed for <150 ms, they appear fused in a plaid-like manner to most human observers (personal observation; Wolfe, 1983). This means that there is no exclusive dominance directly after stimulus onset. Therefore, one should expect later reports of the stimulus orientation in the strongly rivaling compared to the non-rivaling conditions. This is the reason why we introduced a piecemeal epoch at the beginning of the congruent catch trials (see Materials and Methods). (iv) The probability of piecemeal reports peaks at zero contrast difference. This matches expectation, as do the slowed reaction times, and was also observed in our human data.
Taken together, these results indicate that the monkeys followed the intended strategy thoroughly. It remains unclear, however, how strict the monkeys’ criterion for exclusive dominance was. The very low probabilities of piecemeal reports compared to our human subjects (Fig. 3B) may indicate a less strict criterion and hence a tendency to indicate a certain stimulus orientation before rivalry was fully resolved towards this orientation. Such behavior may have been induced by the training protocol. Since during the training period one monkey adopted the strategy of reporting piecemeal in all rivalrous trials, we reduced the water reward for piecemeal reports to half the amount (compared to other correct trials). The reduced reward, together with the need to wait until the end of the trial in case of piecemeal reports (see Materials and Methods), discouraged the monkeys from piecemeal reports, and most probably decreased the respective probabilities to a minimum.
Congruent versus Incongruent Stimulation and Interindividual Differences
Grand average data in Figure 4 showed differences in spectral distributions of LFP power and coherence both between monkeys and between stimulation conditions. Differences mainly affect a γ-band sidelobe which was present only in monkey H and stronger with congruent than with incongruent stimulation. With respect to perception-related modulations, both monkeys in general showed very similar results (Fig. 7). However, in contrast to monkey H, there was a tendency in monkey S for perception-related modulations also in the γ-band, not only in the low-to-medium frequency range. Even in monkey S this tendency is only visible in the overall co-modulation ratio (Fig. 7B) and is almost never significant in the statistical tests for signal power at single recording sites (1 of 41) or for coherence between pairs of recording sites (4 of 210; Fig. 7A). Because of these differing results in monkeys H and S, and between measures in monkey S, we do not consider γ-activity perception-related in the present study.
Differences between congruent and incongruent stimulation conditions are only obvious in monkey H, but not in monkey S. This may be put down to the fact that the differences are mainly present in reduced γ-band activity during incongruent stimulation and γ-activity is generally much less pronounced in monkey S than in monkey H. Whether the different spectral compositions between congruent and incongruent conditions are due to the difference in stereo-correspondence, or to differences in the task demand (e.g. higher attentional allocation during incongruent stimulation), is not clear. For the analysis of perception-related modulations this difference is not directly relevant, since modulations are calculated as orientation-specific contrasts separately within each condition. A further difference between conditions is induced by the transient stimulus change early in the congruent trials (Fig. 2 and Materials and Methods). Due to the response-triggered analysis, and since the modulatory effects occur with high latency, this difference in the stimulation protocol should not affect our results.
Different MUA and LFP Results
While LFP generally tended to be modulated in consonance with perceptual state, and hence reflected the rivalry-induced percepts, this was not the case for MUA. The LFP effects are prominent in the low-to-medium frequency range (4–28 Hz). In contrast, significant modulations of MUA during rivalry were only present at a few recording sites, and the number of co- and anti-modulations was balanced. In addition, the MUA modulations at these sites did not increase in number or strength in advance of the behavioral decision, as is the case for LFP (Fig. 6). The few channels that were significantly modulated in MUA could therefore represent incidental significances due to the arbitrary threshold criterion inherent in statistical testing. In contrast, the simultaneously recorded LFP modulations show a marked asymmetry in favor of perception-related modulations. This observation is supported by the bias towards perception-related modulations in the grand average of the co-modulation ratios (Fig. 7B), which are independent of statistical tests based on single channel data. The difference between MUA and LFP data is clearly visible in this measure as well.
What might be the reason for LFP data reflecting the perceptual state while MUA does not capture it? Taking the perception-related components as ‘signal’ and all components statistically independent from it as ‘noise’, LFP apparently has a better signal-to-noise ratio (S/N) than MUA under the conditions in our study (see also Gail et al., 2000). The differences between S/N in LFP and MUA are probably due to their different origins. While LFP reflects the neural inputs near the electrode tip (superimposed somato-dendritic potentials), including subthreshold components, MUA comprises the superimposed spike output of a much smaller population near the same tip (Legatt et al., 1980; Mitzdorf, 1987). As we found perception-related modulations only in the LFP, we assume that the highly correlated and to a large part subthreshold postsynaptic potentials in local assemblies (Lampl et al., 1999) contain perception-related signal components to a considerable degree. Hence, the sensitivity of LFP to subthreshold modulations may account for its S/N advantage over MUA. This argument does not explain the origin of such perception-related, subthreshold modulations in the supragranular layers of V1, nor their physiological relevance. We will discuss potential origins below and argue for a feedback effect from other cortical areas. Additionally, the integration of LFP over a much larger number of neurons with similar receptive field properties may also contribute to its better S/N. Even though LFP is composed of signals from neurons with largely different orientation preferences, the main contribution is from cells within a radius of ∼300 µm having similar orientation tuning (e.g. Bartfeld and Grinvald, 1992). Note that this is not in contradiction to the less sharp orientation tuning of LFP compared to MUA, since the different tuning widths can be explained by the broad orientation tuning in intracellular recordings compared to their spike outputs (Carandini and Ferster, 2000).
Increased synchronization in the γ-range among local cell populations, representing features of one of the rivaling objects, has been proposed to represent perceptual dominance of this object during rivalry (Kottmann et al., 1996). Observation of increased γ-synchrony associated with perceptual dominance in a rivalry task with strabismic cats has been taken as support for this idea (Fries et al., 1997, 2002), although it is questionable whether this result can be transferred to normally raised animals with an intact visual system. Perception-related modulation of MEG power during rivalry has also been interpreted in favor of the synchronization hypothesis, since power in the MEG signal is associated with the degree of synchronization in local cell populations (Tononi et al., 1998). Corresponding arguments were used for presumed synchronization of large cell populations that are distributed in different cortical areas, based on perception-related large-scale MEG coherence modulations during rivalry (Srinivasan et al., 1999). Due to the method of frequency-tagging, these MEG studies, however, can give no answer on the frequency-specificity of the observed modulations, since only the power at the tag frequency is analyzed.
However, in our present investigation we did not reliably find significant perception-related modulations specific to γ-frequencies. Neither power at single electrodes nor inter-electrode coherence in the γ-band showed significant dependence on perception. In the congruent stimulation condition, however, stimulus-dependent modulations of γ-frequency power and coherence did occur, as they did in many previous studies (e.g. Frien and Eckhorn, 2000; Frien et al., 2000; for reviews, see Eckhorn, 1999; Gray, 1999; Singer, 1999). This means that in our present investigation in area V1 γ-frequency power and coherence better reflect the stimulus properties than the perceptual state. This seems to contrast with another study from our laboratory, in which synchronized γ-activity appeared with the correct perception in a difficult figure-ground task in area V2 (hence, not in V1; Woelbern et al., 2002).
The perception-related modulations of LFP coherence found in the present study occurred in the low- and medium-frequency ranges. Coherence at low frequencies (theta/alpha: 4–12 Hz) has recently been discussed in terms of long-range integrative processes (Schanze and Eckhorn, 1997; von Stein and Sarnthein, 2000). Based on findings that activity at low frequencies may be capable of mediating context dependent top-down feedback to primary visual cortex (von Stein et al., 2000), von Stein and Sarnthein (2000) suggest coupling among cortical areas in the alpha-to-theta range to be most pronounced when the brain is ‘generating [a] hypothesis about the environment’ (p. 311). This is especially necessary when bottom-up information is insufficient for a unique percept, which is also the case in our BR experiments. Hence, the decision in favor of one or the other interpretation of the stimulus can solely be made internally. The perception-related modulations in the 4–12 Hz band may therefore be interpreted as a signature of an internal hypothesis generation about the stimulus. Another necessity for long-range integration associated with low-frequency signals in our task may result from the fact that the receptive fields of our recording sites were all close to the vertical meridian. The stimulus patches in consequence covered right- and left-side visual field positions simultaneously. The cortical representation of the whole stimulus therefore requires integration of cortical activity from both hemispheres. However, on the basis of our data we cannot decide whether the low-frequency activity is associated with corresponding extrastriate or contra-hemispherical activity.
BR in V1
How can the role of area V1 in BR be seen against the background of the available findings? Earlier reports on rivalry-induced modulations along the visual cortical pathway were disparate about the role of V1. Spike rates of very few cells in macaque area V1 showed perception-related modulations, while with increasing level of the visual areas higher numbers of modulated cells were found (Leopold and Logothetis, 1996; Sheinberg and Logothetis, 1997). Corresponding fMRI BOLD measures in humans did show V1 activation being correlated with perception during rivalry, but not as strongly as during the non-rivalrous reference condition (Polonsky et al., 2000). BOLD differences due to alternating perceptual states in humans, being as large during rivalry as during congruent viewing, were found in the blind spot representation of V1 (Tong and Engel, 2001). All these results, together with ours, fit into the same overall picture of V1 participating in BR, given that the BOLD signal represents mostly (subthreshold) somato-dendritic activity. A recent study confirms this view (Logothetis et al., 2001). Although indirect evidence suggests that the BOLD signal correlates well with overall spike activity (V1: Heeger et al., 2000; V5: Rees et al., 2000), a direct comparison of simultaneously recorded fMRI and microelectrode signals showed higher correlation of BOLD with LFP than with spike density (Logothetis et al., 2001). This could explain the differing results on signal modulation during BR in area V1 and would lead to the suggestion that these modulations are mainly subthreshold.
Latency of Modulations
The perception-related modulations that we have found appeared on average shortly before the monkeys’ decisions. The first asymmetries in favor of consistent modulations in the population data can be seen at ∼100 ms before the time of decision (corresponding interval of analysis: –156 to –32 ms). The modulation increases towards the time of decision (Fig. 6). Note that the underlying neural process may have a sharper onset, which would be smoothed by the sliding-window technique. The 100 ms between modulation of V1 activity and motor output is very short if we assume this modulation to be causative for the monkey’s decision: the mean reaction times in the congruent conditions were 395 ms for monkey H and 372 ms for monkey S. The stimulus–response latency to the supragranular V1 layers in our data is 50–55 ms (MUA, monkey H = 51 ± 4.1 ms, monkey S = 52 ± 4.3 ms; LFP, H = 51 ± 4.0 ms, S = 54 ± 3.8 ms; latency was defined as the post-stimulus interval after which the PSTH, normalized to pre-stimulus SD, becomes >5; data not shown). The delay from the stimulus-specific activation in V1 to the motor output in this case is >300 ms. This estimate is conservative in two ways. First, most trials are shorter than the mean reaction time, since reaction times are typically asymmetrically distributed with a long tail towards long reaction times. Therefore the mean overestimates the typical reaction time. But median values are only ∼10 ms shorter in our data, so that the difference is almost negligible. Secondly, and more importantly, even when assuming that the modulated V1 activity reflects perceptual states responsible for triggering decisions, it is not clear whether it is the earliest V1 activation which is relevant. Feature selectivity, like orientation tuning, is already represented in the earliest spike activities in V1 of awake monkeys (e.g. Lamme et al., 1999; Mazer et al., 2002).
Therefore, the relevant information for our task (orientation discrimination) is available in the early responses in V1 during congruent stimulation. For our experiments this means that it took ∼300 ms from the moment when the relevant information was available in V1 to the monkey’s motor output. When we take this delay as reference, the interval between the first perception-related modulations in V1 during incongruent stimulation (Fig. 6) and the motor output is too short to trigger the monkey’s decision causatively.
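The latency criterion and the delay estimate used in this argument can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function name `onset_latency_ms`, the bin width and the synthetic PSTH are hypothetical. Only the threshold of 5 pre-stimulus SDs, the mean reaction times (395 and 372 ms) and the MUA latencies (51 and 52 ms) are taken from the text. The sketch subtracts the pre-stimulus mean before dividing by the pre-stimulus SD, which is an assumption about the exact normalization.

```python
from statistics import mean, pstdev

def onset_latency_ms(psth, n_pre_bins, bin_ms=1.0, threshold=5.0):
    """Return the first post-stimulus time (ms) at which the PSTH,
    normalized to the pre-stimulus SD, exceeds `threshold`.

    psth       -- sequence of rates; the first n_pre_bins precede stimulus onset
    Returns None if the threshold is never crossed.
    """
    baseline = psth[:n_pre_bins]
    mu, sd = mean(baseline), pstdev(baseline)
    if sd == 0:
        raise ValueError("pre-stimulus SD is zero; cannot normalize")
    for i, rate in enumerate(psth[n_pre_bins:]):
        if (rate - mu) / sd > threshold:
            return i * bin_ms
    return None

# Synthetic example (hypothetical data): noisy baseline, response from bin 51 on.
pre = [10, 12, 9, 11, 10, 8, 11, 9, 10, 10]
post = [10] * 51 + [40] * 20
print(onset_latency_ms(pre + post, n_pre_bins=len(pre)))

# Delay from stimulus-specific V1 activation to motor output,
# using the mean reaction times and MUA latencies reported in the text.
reaction_ms = {"H": 395, "S": 372}
latency_ms = {"H": 51, "S": 52}
delay_ms = {m: reaction_ms[m] - latency_ms[m] for m in reaction_ms}
print(delay_ms)  # {'H': 344, 'S': 320}
```

Both delays exceed 300 ms, which is the reference against which the ∼100 ms lead of the perception-related modulations is judged.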
What are potential reasons for the latency of the modulations? First, from the psychophysical data we suspect that the highly trained monkeys could have been premature in making the decision. They may have judged the oncoming percept before perceptual dominance of one grating was exclusive. Measures reflecting the perceptual state of dominance then would be late. Since we were aiming for signal components potentially causative for the monkeys’ decisions, our stimulation protocol does not allow evaluating later time intervals after the monkey pushed the key. Hence, we cannot determine whether the increase in the perception-related modulation continues after the decision. Secondly, the perception-related modulations described in this study may not originate in area V1, but instead reflect top-down influence from higher visual areas. Such feedback projections can act as fast as internal processing within area V1 (Hupé et al., 2001). However, if feedback is responsible for the observed modulations, the delay considerations on the basis of feed-forward circuits given above would not be relevant here (more about possible top-down effects below).
Eye versus Object Rivalry
Ocularity of the neurons revealing perception-related activity has attained a key role in determining the relevance of the inter-ocular competition concept of rivalry on the one hand (eye rivalry: Lehky, 1988; Blake, 1989; Lee and Blake, 1999; Tong and Engel, 2001) and the concept of competition among high level percepts on the other hand (object rivalry: Kovács et al., 1996; Logothetis et al., 1996; Ngo et al., 2000). Against the background of the huge variety of seemingly contradictory psychophysical and neurophysiologic findings on this issue, the view is gaining ground that rivalry may be resolved at several levels of processing, depending on stimulus properties (Bonneh et al., 2001; Blake and Logothetis, 2002). With respect to stimulus size and uniformity we used stimuli with high perceptual coherence, which according to Bonneh et al. (2001) tend to evoke object rather than eye rivalry. Consistently, we found no difference in the strength of ocular dominance between those channels showing orientation-selective modulation with perception (comparing perception of a horizontal versus vertical grating) and the remaining recording sites. Neither did we find perception-related modulations with respect to ocular selectivity (comparing perception of the left versus right stimulus). On the other hand, we used high contrasts, which are more likely to be subject to eye rivalry according to the findings of Lee and Blake (1999). However, our data give no indication of eye rivalry.
Extrastriate Impact on Area V1?
From our data we cannot decide about the physiological origin of the LFP modulations. Nevertheless, we consider feed-forward influence from LGNd unlikely to be responsible for them, because a previous study in awake monkey LGN failed to show differential activation due to congruent and incongruent stimulation (Lehky and Maunsell, 1996). However, in these experiments the monkeys’ perception was not monitored. More generally, a direct feed-forward solution for resolving the sensory ambiguity does not seem plausible to us for two reasons. First, the latency of the observed effects in our data argues against such a view (although, in principle, mechanisms based on slower thalamocortical interactions cannot be ruled out). Secondly, we typically recorded from the upper layers 2/3 of area V1. If BR is in any way resolved at the level of V1 or earlier and this information is transmitted to higher visual areas, then this should be measurable in spike activities of layer 2/3, since these are the main feed-forward output layers to higher cortical levels (e.g. Rockland and Pandya, 1979).
In our view this makes feedback of perception-related signals from extrastriate cortical areas a plausible explanation for the LFP modulations. This also fits with functional imaging studies on BR which reported that several extrastriate areas are modulated by perceptual alternations. These include the fusiform face area (FFA) and the parahippocampal place area (PPA), when using rivalry between faces and houses (Tong et al., 1998). Early ventral stream (V1–V4) was shown to reflect perceptual alternations with the less complex grating stimuli (Polonsky et al., 2000). Further, occipitotemporal involvement in expression of different perceptual states during rivalry was reported by Lumer et al. (1998). Other whole-head studies in humans based on VEPs (e.g. Brown and Norcia, 1997) and MEG (Tononi et al., 1998; Srinivasan et al., 1999) lacked the spatial resolution to resolve the participating cortical sites in terms of area definitions. Sheinberg and Logothetis (1997) have shown that BR in monkeys seems fully resolved in inferotemporal cortex since single cell spike rate modulations were almost always correlated to perceptual switches in their study. The notion of extrinsic impact on V1 being responsible for the observed perception-related modulations in the present study is plausible since many extrastriate feedback connections terminate in layer 2/3 of V1 (in addition to infragranular layers) and are typically modulatory in nature (reviewed in Felleman and van Essen, 1991; Salin and Bullier, 1995). Such modulatory feedback has been reported in previous studies. For example, Mehta et al. (2000a,b) found modulations of current source density specifically in supragranular layers of V1 associated with intermodal selective attention. As in our data, the authors did not find corresponding MUA modulations in V1. 
In summary, feedback from temporal areas to V1 during BR is conceivable, given the known direct (Rockland and van Hoesen, 1994) or indirect (Felleman et al., 1997) projections and the distribution of rivalry-associated activity found in different studies. The functional relevance, however, remains unclear.
Summary and Conclusion
Our results demonstrate that low-to-medium frequency LFPs in monkey area V1 can be correlated with perceptual alternations during BR, while MUA signals are not. The question remains to what extent these signal components represent awareness-relevant activity itself, or take part in awareness-relevant selection processes, or are just an epiphenomenon of other awareness-relevant processes not captured in the present recordings. The somato-dendritic nature of the observed modulations (LFP), together with their latency, suggests a feedback impact on V1 from other cortical areas already representing or carrying a prediction about the oncoming perceptual state. The functional role of such an impact could be to stabilize a newly established percept by supporting V1 neurons in representing features of the currently or imminently dominant stimulus.
Support by DFG to the research group ‘Dynamics of Cognitive Representation’ (DFG EC 53/9-3 to R.E.) is gratefully acknowledged. We also thank Professor R. Bauer for help in experiments, W. Gerber, A. Rentzos and A. Platzner for their support in experimental techniques, and D. Leopold for instructive discussions on BR.