We explored the neural basis for spatial color contrast (red looks redder surrounded by green) and temporal color contrast (red looks redder if preceded by green) in primary visual cortex (V1) of the alert macaque. Using pairs of stimuli, we found a subset of neurons that gave stronger responses to sequences of red and green spots and stronger responses to adjacent red and green spots. These cells combined their cone inputs linearly: for a red-ON-center cell, the sum of the OFF response to green and the ON response to red predicted the peak response to red preceded by green. These ‘color’ cells, which could underlie hue discrimination because they show cone opponency, could mediate spatial and temporal color contrast. In contrast, the majority of cortical cells, which do not show overt cone opponency but which are often orientation tuned and/or direction selective, are by themselves incapable of mediating hue discrimination. The remarkable degree of specialization shown by cells in V1, especially that of the double-opponent color cells, is discussed.
Color vision in Old World primates (including man) begins with the differential wavelength sensitivity of the three classes of cones (L, M and S) that form a mosaic across the retina (Roorda and Williams, 1999). Retinal ganglion cells that receive opponent input from the different cone classes (Wiesel and Hubel, 1966; Reid and Shapley, 1992; Dacey and Lee, 1994; Chichilnisky and Baylor, 1999) seem especially likely to underlie color opponency — the cardinal feature of color vision (Hering, 1964). How color is processed in the primary visual cortex (V1), where calculations of color contrast probably arise, is less clear.
The issue of which cells in V1 encode color has been obfuscated by the larger debate concerning specialization in the primate visual system. Early studies argued that cells in V1 were specialized to represent only a portion of the visual world (Hubel and Wiesel, 1968). For example, some cells were shown to be strongly selective for the orientation of a stimulus and were thus interpreted to contribute to ‘form’ perception; other cells were strongly selective for the direction of motion of a stimulus and were therefore thought to contribute to ‘motion’ perception; and though which cells are responsible for color was not clear, a good candidate was thought to be the ‘double-opponent’ cells (Hubel and Wiesel, 1968; Michael, 1978a). However, subsequent studies argued that single cells ‘multiplex’ a representation of multiple aspects of the visual world [e.g. (Leventhal et al., 1995); for reviews, see works by Gegenfurtner and Lennie (Gegenfurtner and Sharpe, 1999; Lennie, 2000; Gegenfurtner, 2001)]. According to this proposal, a single V1 neuron is seen to contribute to our perception of color, form and motion. Despite the obvious limitations of this theory (for example, not all cells in V1 are direction selective, making it difficult to argue that all cells contribute to motion perception), the notion of multiplexing has been accepted by textbooks (Gegenfurtner and Sharpe, 1999; Lennie, 2000) and we seem to have lost sight of the obvious specializations (and implications thereof) displayed by cortical neurons.
The idea of multiplexing is especially problematic in the realm of color processing because not all cells show overt cone opponency. And overt cone opponency would seem to be a requirement for a cell to contribute directly to hue discrimination (which we consider a necessary feature of color perception). In fact, only ∼10% of cells have been shown to respond in an opponent way to opponent colors (e.g. red vs green or blue vs yellow) (Michael, 1978a,b; Livingstone and Hubel, 1984; Ts’o and Gilbert, 1988; Conway, 2001). Moreover, our understanding of the color-coding ability of cortical neurons has been clouded by the use of less-than-optimal stimuli. For example, it is clear that the spatial context of an image shapes color perception (Albers, 1963) and yet in the pioneering (but only) investigation focusing on the temporal chromatic properties of cortical cells in primate V1, full-field stimuli were used (Cottaris and De Valois, 1998). Full-field stimuli would have confounded the contribution of spatial context. This confound is made more obviously problematic when one considers the spatial structure of cortical receptive fields. Using quantitative techniques, we recently showed directly that some cortical cell receptive fields are in fact double opponent (Conway, 2001), confirming earlier, but controversial, claims (Michael, 1978a). The double opponent name derives from the fact that these cells have both spatially structured and chromatically opponent receptive fields, a feature that makes them extremely likely candidates for the neural basis of spatial color contrast and color constancy (Daw, 1968; Rubin and Richards, 1982). Full-field stimuli would have confounded the differing chromatic tuning of receptive-field centers and surrounds of double-opponent cells, making the responses to such stimuli a challenge to interpret.
Color perception shows both spatial (simultaneous) and temporal (sequential) color contrast: red appears redder when surrounded by green or when immediately preceded by green (Hurvich, 1981; Daw, 1987; Eskew et al., 1994). Here we re-investigated both spatial and temporal receptive field structure of color cells in V1 of the alert macaque using spatially restricted stimuli (spots). We go further to measure the responses to pairs of colored spots of cone-isolating light, presented both simultaneously and sequentially, to directly address a role for these cells in spatial and temporal color contrast. The results show that only some cells in V1, and not others, could underlie spatial and temporal color contrast. This means that, as far as color is concerned, only a subset of cells in V1 likely contributes to color perception.
Materials and Methods
Experiments were conducted in alert adult male macaque monkeys. Macaques are a useful model for human color vision because psychophysical studies in them match those of humans (De Valois et al., 1974; Sandell et al., 1979). Moreover, the psychophysical results on human color matching are well predicted from the spectral sensitivities of the macaque cones (Baylor et al., 1987). Monkeys were trained to fixate within a 1° radius of a fixation spot to receive a juice reward. Data collected when the monkeys moved their eyes outside this tight fixation window were not analyzed. The monkeys were fitted with scleral eye coils; during recordings, a given monkey’s head was placed in a magnetic field. The eye position, inferred by changes in current in the eye coil, was measured (CNC Engineering, Seattle, WA). This system has a spatial resolution of 0.05°, and was calibrated at the beginning of each recording session by having the monkey look at the center of the monitor and four dots at the corners of the monitor (Livingstone et al., 1996). The monkeys had to maintain fixation for 3–4 s within the fixation window to receive a juice reward. During periods of stable fixation, average residual eye movements were less than 0.25°. These eye movements were compensated using an eye-position correction technique (Livingstone, 1998). In generating one-dimensional space–time maps, this technique affords the measurement of receptive field widths as narrow as 0.2° (Livingstone and Tsao, 1999). The resolution of the technique is finer than the receptive field subregions of the cells studied here [color cells typically have centers ∼0.5° wide (Conway, 2001)].
Stimuli were presented (in a dark room) on a computer monitor (Barco Display Systems, Kortrijk, Belgium) 100 cm from the monkeys’ eyes. Neuron responses were recorded extracellularly using fine electropolished tungsten electrodes coated with vinyl lacquer (Frederick Haer, Bowdoinham, ME) (Hubel, 1957). Action potentials from single neurons were isolated using a dual-window discriminator (BAK Electronics, Germantown, MD) after they were amplified and bandpass filtered (1–10 kHz). Only well-isolated units were analyzed. All stimuli used to measure the color interactions were cone-isolating and presented on a neutral gray background (see Results). The stimuli that were used to measure the peak L and M response modulation (see Fig. 2) were high-cone-contrast stimuli presented on different adapting backgrounds (Conway, 2001). Methods for generating cone-isolating stimuli, and a discussion of their validity, are given elsewhere (Estevez and Spekreijse, 1982; Conway, 2001).
The responses from single units in V1 of two alert macaques were recorded using tungsten electrodes. Eye movements, measured using the eye-coil and magnetic field system, were subtracted from stimulus positions using an eye-position corrected reverse correlation technique to give stimulus positions in retinal coordinates (Livingstone, 1998). We recorded from ∼615 single cells in V1. Each cell was tested with cone-isolating spots presented on a neutral gray background (Donner and Rushton, 1959; Conway, 2001). Cone-isolating stimuli, which enable the activity of a single cone class to be modulated, take advantage of the principle of univariance: the absorption spectrum of each cone pigment is broad, which means that a given cone class can be activated to the same extent by a wide range of wavelengths just by varying the intensity of the light. A light of optimal wavelength and lower intensity can be just as effective as a light of less-than-optimal wavelength of stronger intensity. To make cone-isolating stimuli, one defines two colors that activate two of the three cones identically. A bright red, for example, would stimulate the L cones a lot and the M and S cones a little. And a dim bluish-green would stimulate the L cones a little and the M and S cones the same. Thus between these two ‘cone-isolating’ colors, only the activity of the L cone class is modulated. In most experiments, we defined one of the colors as gray. A small patch of this gray could be removed and replaced with a colored patch that stimulated two of the three cones identically. A single frame from the stimulus we used to quantify the responses gives an idea of the appearance of the stimuli (Fig. 1A): the green patch increased the activity of the M cones, and activated the S and L cones the same amount as the gray; the red patch increased the activity of the L cones, and activated the S and M cones the same amount as the gray. 
Using gray as one of the colors limits the cone contrast that can be achieved but it allows two cone-isolating colors to be presented simultaneously (e.g. Fig. 1A), which enabled us to map the simultaneous color interactions. In addition to stimuli that selectively increase the cone activity of a given cone class (plus stimuli), opposite-contrast cone-isolating stimuli, which decrease the cone activity of a given cone class, could be made [minus stimuli (Conway, 2001)]. We used plus stimuli to determine the cone interactions, and both plus and minus stimuli to determine cone weights (see Fig. 2).
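The construction of a cone-isolating stimulus around gray can be sketched as a small linear-algebra problem. The sketch below is illustrative only: the matrix `GUN_TO_CONE` (cone excitation per unit monitor gun intensity) holds hypothetical values, not the calibration used in these experiments, and the function names are our own. It solves for the change in gun intensities that modulates one cone class by a chosen fractional contrast while leaving the other two cone classes at their gray-level excitation.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a x = b by Gauss-Jordan elimination."""
    n = 3
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("gun-to-cone matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# HYPOTHETICAL calibration: rows are cones (L, M, S), columns are guns (R, G, B).
GUN_TO_CONE = [
    [0.60, 0.35, 0.05],
    [0.25, 0.65, 0.10],
    [0.02, 0.10, 0.88],
]


def cone_excitation(rgb):
    """Cone excitations (L, M, S) produced by gun intensities (R, G, B)."""
    return [sum(w * g for w, g in zip(row, rgb)) for row in GUN_TO_CONE]


def cone_isolating_rgb(gray_rgb, target_cone, contrast):
    """Gun intensities that change only `target_cone` (0=L, 1=M, 2=S).

    `contrast` is the fractional change relative to gray: positive for a
    'plus' stimulus, negative for a 'minus' stimulus.
    """
    base = cone_excitation(gray_rgb)
    delta = [0.0, 0.0, 0.0]
    delta[target_cone] = contrast * base[target_cone]
    d_rgb = solve3(GUN_TO_CONE, delta)
    rgb = [g + d for g, d in zip(gray_rgb, d_rgb)]
    if any(v < 0.0 or v > 1.0 for v in rgb):
        # The attainable contrast is limited by the monitor gamut.
        raise ValueError("requested contrast is outside the monitor gamut")
    return rgb


gray = [0.5, 0.5, 0.5]
l_plus = cone_isolating_rgb(gray, target_cone=0, contrast=0.2)   # reddish patch
m_minus = cone_isolating_rgb(gray, target_cone=1, contrast=-0.2)
```

With these made-up numbers, the L-plus patch has more red gun and less green gun than the gray, in keeping with its reddish appearance; requesting a much larger contrast raises the gamut error, which is one way to see why using gray as one of the two colors limits the achievable cone contrast.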
Cells were screened with a single colored patch at a time, restricted in size to the receptive field center and surrounded by the gray background. The patch was displayed for 500 ms and then removed, and replaced with a uniform gray field. This was repeated until the responses, which were amplified on an audio monitor, could be characterized as ON responses, OFF responses or ON/OFF responses. The procedure was repeated with a different colored patch. We defined ‘color cells’ as those that showed an ON response to one color and an OFF response to the opposite color (red vs green). The stimuli used to quantify the color interactions involved presenting a pair of stimuli at a time (Fig. 1A).
Sixty-five cells were classified as ‘color’ cells: they showed opposite-sign responses (excitation vs suppression) to red (L-isolating) and to green (M-isolating) spots (Fig. 1B, left plot). Because cells were selected for further study based on a screen for L versus M opponency, we were precluded from finding blue–yellow cells, which would show the same sign of response to M and L stimuli (assuming blue–yellow cells receive inputs from both L and M cones). We did, however, quantify the responses of 12 cells that did not show red–green opponency (and which we confirmed were not blue–yellow cells). We classified these as ‘non-color’ cells; all of them were orientation-selective complex cells (Hubel and Wiesel, 1968). Non-color cells gave responses of a similar sign to red and green spots, though the magnitudes of the responses were not always equal so these cells might have been capable of relaying some color information (Fig. 1B, right plot). But because these cells did not show opponent responses they could not be involved directly in hue discrimination, justifying our designation of them as ‘non-color’. It should be emphasized that we only mapped 12 such cells. These were not screened for anything other than lack of cone opponency, yet represent only a small random sample of the non-cone-opponent cells. Regardless, all of the non-color cells responded in a similar way (see Fig. 1C). For a survey of the chromatic properties of a large sample of cortical cells, the reader is directed to Johnson et al. (Johnson et al., 2001), Lennie et al. (Lennie et al., 1990) and Livingstone and Hubel (Livingstone and Hubel, 1984).
We mapped the spatial and temporal color interactions of 36 color cells and 12 non-color cells. Pairs of cone-isolating stimuli were presented at random locations along a range of locations running through the receptive field center (the stimulus range, Fig. 1A). Color cell receptive fields were sometimes not circularly symmetric, but rather asymmetric or coarsely oriented (Fig. 1C, left plot) — the spatial structure of the receptive fields was determined before mapping the interactions (Conway, 2001). Thus if a cell had a receptive field that was coarsely oriented, the stimulus range was placed so that the stimuli matched the cell’s orientation preference and could optimally stimulate the cell. Thousands of frames were used to map a given cell; each frame (e.g. Fig. 1A) in a given stimulus run was the same duration (between 25 and 100 ms).
Spatial Color Interactions
To determine the simultaneous color interactions, from a continuous history of spike and stimulus timing, we reverse correlated the response to every spatial configuration of the pair of bars along the stimulus range, accounting for the visual latency (Livingstone and Tsao, 1999; Conway, 2001). The responses are plotted in (x–y) coordinates, with zero corresponding to the center of the stimulus range, according to a color scale bar (Fig. 1C). For example, the maximum firing rate of the color cell shown in Figure 1C occurred when the L bar was at position 0.75 (x-axis) and the M bar was at 0.5 (y-axis). The arms of the cross in the cone-interaction maps represent the responses of the cell to those occasions when one of the stimuli was in the receptive field and the other was not. For example, the response shown at (0.75, 1.5) was the response of the cell to the L-cone-isolating stimulus in the center of the receptive field and the M-cone-isolating stimulus well outside the receptive field. Note that the cell’s peak response to the M-cone-isolating stimulus (y = 0.5) was offset from the cell’s peak response to the L-cone-isolating stimulus (x = 0.75); moreover, there was only one location of peak response to the M-cone-isolating stimulus (shown by the single horizontal band centered on y = 0.5). This is consistent with a single M+/L– receptive-field flank, to the left of the L+/M– receptive-field center (see the receptive-field schematic for the color cell, Fig. 1C). Cells that had receptive fields with a single chromatically opponent region [analogous to Type II cells in the lateral geniculate nucleus (Wiesel and Hubel, 1966)] showed just the horizontal or the vertical arm of the cross.
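The construction of a cone-interaction map can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis code used here: time is discretized into stimulus frames, the visual latency is expressed as a whole number of frames, eye-position correction is omitted, and the toy cell (`toy_response`) is a hypothetical rectified linear unit with an L+/M– center at position index 2 and an M+/L– flank at index 1.

```python
def interaction_map(frames, spikes, n_positions, latency_frames):
    """Mean spike count for every (L position, M position) configuration.

    frames[i] = (l_idx, m_idx): positions of the L and M bars in frame i.
    spikes[i] = number of spikes recorded during frame i.
    Returns cmap[m_idx][l_idx], so rows follow the y-axis (M bar) and
    columns follow the x-axis (L bar); untested configurations are None.
    """
    total = [[0.0] * n_positions for _ in range(n_positions)]
    count = [[0] * n_positions for _ in range(n_positions)]
    for i, (l_idx, m_idx) in enumerate(frames):
        j = i + latency_frames          # look ahead by the visual latency
        if j >= len(spikes):
            break
        total[m_idx][l_idx] += spikes[j]
        count[m_idx][l_idx] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else None
             for c in range(n_positions)]
            for r in range(n_positions)]


def toy_response(l_idx, m_idx):
    """HYPOTHETICAL double-opponent cell: baseline of 1 spike per frame."""
    drive = 0
    drive += 2 if l_idx == 2 else 0     # L+ in the center excites
    drive -= 2 if m_idx == 2 else 0     # M+ in the center suppresses
    drive += 2 if m_idx == 1 else 0     # M+ in the flank excites
    drive -= 2 if l_idx == 1 else 0     # L+ in the flank suppresses
    return max(0, 1 + drive)            # firing cannot drop below zero


N_POS, LATENCY = 4, 2
frames = [(l, m) for _ in range(3) for l in range(N_POS) for m in range(N_POS)]
spikes = [0] * LATENCY + [toy_response(l, m) for l, m in frames[:-LATENCY]]
cmap = interaction_map(frames, spikes, N_POS, LATENCY)
```

In this toy example the entry for adjacent bars (L in the center, M in the flank) exceeds either arm of the cross, and the entries on the x = y diagonal stay at baseline, which mirrors the qualitative pattern described for double-opponent cells.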
Twenty out of 36 color cells showed chromatically opponent surround responses, with interaction maps similar to the one in Figure 1C (left plot). In all of these cells, stimulating both center and surround simultaneously, with adjacent red and green spots, elicited stronger activity than stimulating either subregion alone. This was reflected in the cone-interaction maps as increased activity along one or both diagonals parallel to the x = y diagonal, and is shown in the maps as a blob redder than the color in either arm of the cross. In Figure 1C, left plot, the increase is only below the diagonal, which reflects the presence of only one receptive-field flank. The elevated response to adjacent L and M bars shows directly that these cells are suited to mediate spatial color contrast. All non-color complex cells, on the other hand, showed a completely different response pattern in their interaction maps (Fig. 1C, right plot): increased activity along the x = y diagonal and decreased activity along flanking diagonals. Thus even though non-color cells can show differently weighted inputs from different cone classes (Fig. 1B, right plot), they are not suited to color vision: they lack cone opponency (Fig. 1B, right plot) and they lack the ability to mediate spatial chromatic contrast (Fig. 1C, right plot). That only 20 of the 36 color cells showed significantly larger responses to adjacent red and green spots reflects the underlying variability in surround strength among color cells (Conway, 2001).
The simultaneous color interaction maps of all the color cells showed little or no change in activity from baseline along the entire x = y diagonal, indicating that the cells did not respond to overlapping L + M bars (resulting in a yellow stimulus) anywhere in the receptive field. Thus the cells were chromatically opponent throughout the receptive-field center and any surrounding subregions. This cone opponency is clear in Figure 1B, left plot, where the response of the cell to center stimulation with L + M (plotted in yellow) shows little deviation from baseline [unlike the response to center stimulation with L alone (Fig. 1B, red trace) or M alone (Fig. 1B, green trace)].
The fact that the response to the yellow stimulus was negligible shows that the responses to L+ and to M+ were not only opponent but also balanced. This is a valid conclusion because the cone contrasts of the L and M stimuli were matched (Fig. 1, legend). Thus even though the suppression caused by the M stimulus did not appear as strong as the excitation caused by the L stimulus (possibly because measurement of suppression is limited because the cell’s firing cannot drop below zero), the suppression by M was strong enough to oppose the strong excitation to L.
Red–green cells tend to show opponent and balanced responses to L and M, as shown in Figure 2, where higher cone-contrast stimuli, using different colored backgrounds, were used. In Figure 2, instead of using reduction of firing as a measure of suppression (which can underestimate the extent of suppression because of rectification), we assumed that the suppression was equal in magnitude but opposite in sign to the excitation produced by the opposite-contrast cone-isolating stimulus (Tolhurst and Dean, 1990; Ferster, 1994). For example, the suppression by an M+ cone-isolating stimulus (a stimulus that increases the activity of the M cones) would be equal in magnitude, but opposite in sign, to the peak excitation by an M– cone-isolating stimulus (a stimulus that decreases the activity of the M cones) (Conway, 2001). Figure 2 shows the peak center response to the L+ stimulus (x-axis) versus the peak center response to the M– stimulus (y-axis) for the L-ON-center cells (squares), and the peak center response to the M+ stimulus (y-axis) versus the peak center response to the L– stimulus (x-axis) for the M-ON-center cells (circles). The slope of the relationship between L and M response for the M-ON-center cells is –0.903 (r² = 0.4); and that for the L-ON-center cells is –1.02 (r² = 0.3); and that for both populations combined is –0.9 (r² = 0.9). The slope, which is almost –1, shows that the cells receive almost equal and opposite inputs from L and M cones. The slope may be slightly shallower than –1 because the S-cone contribution, which usually opposes the L-cone contribution in red–green cells (Conway, 2001), is not accounted for.
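The slope and r² of an L-versus-M relationship of this kind can be obtained by ordinary least squares, as in the minimal sketch below. The sign convention is our assumption for illustration: the suppressive input is entered as the negative of the peak excitation to the opposite-contrast stimulus, so L-ON-center cells have positive L and negative M values and M-ON-center cells the reverse. The response values are hypothetical and are not the data of Figure 2.

```python
def opponent_point(cell_type, on_peak, opposite_minus_peak):
    """Return the (L, M) response pair for one cell.

    cell_type: "L-ON" or "M-ON".
    on_peak: peak excitation to the plus stimulus of the ON cone.
    opposite_minus_peak: peak excitation to the minus stimulus of the other
        cone, used as an estimate of the (sign-inverted) suppression.
    """
    if cell_type == "L-ON":
        return on_peak, -opposite_minus_peak
    if cell_type == "M-ON":
        return -opposite_minus_peak, on_peak
    raise ValueError("cell_type must be 'L-ON' or 'M-ON'")


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# HYPOTHETICAL cells with perfectly balanced opponent inputs (spikes/s).
cells = [("L-ON", 10, 10), ("L-ON", 20, 20), ("M-ON", 15, 15), ("M-ON", 30, 30)]
points = [opponent_point(*cell) for cell in cells]
slope, intercept, r_squared = fit_line([p[0] for p in points],
                                       [p[1] for p in points])
```

For perfectly balanced inputs, as in this toy data set, the fitted slope is –1 with r² of 1; unequal L and M weights would move the slope away from –1.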
Thus, spatial cone-interaction maps are a useful means of classifying cells in V1. Some cells, which we call color cells, show decreased activity along the x = y diagonal and often show increased activity along flanking diagonals; other cells, which we call non-color cells, show increased activity along the x = y diagonal and decreased activity along the flanking diagonals (Fig. 1C). Color cells show a pattern that suggests they underlie color opponency and spatial color contrast, justifying our designation of these cells as a distinct and specialized class of cell.
Temporal Color Interactions
From the same spike train used to map the simultaneous color interactions, we determined the temporal pattern of response to each color at each location along the stimulus range. This was possible even though two different stimuli were presented in any given frame because the spatial relationship between the stimuli along the stimulus range was random. Thus, by only considering the response to one color stimulus, we averaged out the response to the other color. We confirmed that this was valid by mapping a few cells with just one colored stimulus at a time. Of course it would have been better to map all cells in this way, but it would have taken at least three times as long, and the responses would be from different spike trains. The resulting ‘space–time’ maps are analogous to a set of post-stimulus time histograms to stimulation at each point along the stimulus range, where the stimulus range is along the x-axis and the time after stimulation is on the y-axis (Fig. 3A–C). Importantly, these maps are distinct from typical reverse-correlation maps because they show the response to every stimulus, regardless of whether or not an action potential occurred. Conventional reverse-correlation maps determine the average stimulus that preceded each action potential. To acknowledge this distinction, these maps are probably better described as ‘forward-correlated’, although admittedly that description is cumbersome.
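The difference from conventional reverse correlation can be made concrete with a minimal sketch of a ‘forward-correlated’ space–time map. The sketch assumes, for simplicity, a single stimulus per frame, time binned in whole frames, and a hypothetical toy cell that fires one spike in the frame after a stimulus lands on position index 2; none of these details is taken from the actual analysis.

```python
import random


def space_time_map(stim_positions, spikes, n_positions, n_delays):
    """Average response following every stimulus, spike or no spike.

    stim_positions[i] = position index of the stimulus shown in frame i.
    spikes[i] = number of spikes recorded during frame i.
    Returns st_map[delay][position]: mean spike count `delay` frames after a
    stimulus appeared at `position` (None if that position was never shown).
    """
    total = [[0.0] * n_positions for _ in range(n_delays)]
    count = [[0] * n_positions for _ in range(n_delays)]
    for i, pos in enumerate(stim_positions):
        for delay in range(n_delays):
            j = i + delay
            if j >= len(spikes):
                break
            total[delay][pos] += spikes[j]
            count[delay][pos] += 1
    return [[total[d][p] / count[d][p] if count[d][p] else None
             for p in range(n_positions)]
            for d in range(n_delays)]


rng = random.Random(0)                      # fixed seed for repeatability
positions = [rng.randrange(4) for _ in range(400)]
# HYPOTHETICAL cell: one spike in the frame after a stimulus at position 2.
spikes = [0] + [1 if p == 2 else 0 for p in positions[:-1]]
st_map = space_time_map(positions, spikes, n_positions=4, n_delays=3)
```

Every stimulus presentation contributes to the average, including those followed by no spike, so each column of the map is equivalent to a post-stimulus time histogram for one position. At delays other than the cell's latency, the entries reflect responses to the other, randomly placed stimuli in the sequence.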
In the space–time maps, activity is mapped as a function of the position of each stimulus whose onset is assigned to be at time = 0, but other stimuli were presented immediately following each stimulus (though at random positions relative to the time = 0 stimulus). The other stimuli can elevate the baseline activity at intervals corresponding to the temporal frequency with which stimuli were presented. This is reflected in the maps as ‘non-specific’ bands (to borrow a term from the molecular biologists). Thus a black/blue region in a map only represents significant suppression if it is darker than the average color outside the receptive field, for that delay (e.g. the asterisk in the L map and the arrow in the M map, Fig. 3A); similarly, red represents excitation if it is above background for a given delay (e.g. the asterisk in the M map and the arrow in the L map, Fig. 3A).
Only at the visual latency (∼50 ms) does the pattern of activity across the stimulus range reflect the cell’s receptive field. A cell may respond not only to the onset of a stimulus, but also throughout the duration of the stimulus and/or to the cessation of the stimulus. A cell’s response to the cessation of a stimulus will be represented in the map at a delay corresponding to the sum of the visual latency of the cell (∼50 ms) and the duration of the stimulus (25–100 ms). Note that we use the term suppression to acknowledge the fact that we do not know the mechanism for the decrease in firing rate: we do not know if it is inhibition or withdrawal of excitation.
All the cells in Figure 3 were color cells; for example, at a short delay, the cell shown in Figure 3A rarely fired in response to a red spot (asterisk, L map) but often did in response to a green spot (asterisk, M map). Some cells (20/36) showed clear spatial color opponency: in the receptive field region surrounding the center, the pattern of probability was reversed (e.g. Fig. 3B, arrowhead). The combination of chromatic and spatial opponency earns these cells the designation double opponent; these were the cells that showed increased responses to adjacent red and green spots (Fig. 1C, left plot). The remaining cells (16/36) showed little sign of spatial opponency: the region surrounding the center was not modulated by >1 SD from the background. These cells may be best described as cortical Type II cells (Wiesel and Hubel, 1966), though the distinction between Type II and double opponent is somewhat arbitrary because the cells show a range of surround strengths (Conway, 2001), which may be influenced by the cone contrast of the stimuli.
Color cells often showed not only spatial color opponency but also temporal opponency: most color cells (32/36) that were suppressed by the onset of a stimulus were excited by the cessation of it, while cells that were excited by the onset of a stimulus were usually suppressed by the cessation of it (Motokawa, 1962; Poggio et al., 1975; Livingstone and Hubel, 1984; Conway, 2001). For example, the cell shown in Figure 3A gave an ON response to green spots (asterisk, M map), which was followed by suppression at stimulus cessation (arrow, M map). In contrast, red spots caused suppression, which was followed by a rebound OFF discharge upon stimulus cessation.
The temporal pattern of the responses suggests that these cells might give optimum responses to sequences of appropriately chosen, differently colored spots: the OFF discharge (e.g. center response to red spots, Fig. 3A) might add to or facilitate the ON response to a subsequent stimulus (e.g. the center response to green spots, Fig. 3A). This modulation could be a mechanism for temporal color contrast: in the same way a briefly flashed red spot appears redder when preceded by a briefly flashed green spot (Eskew et al., 1994), a red-ON-center cell may be expected to fire more strongly to red when red is preceded by green. In the next set of experiments, we tested for temporal color interactions directly. For the quantitative studies, all stimuli were restricted to the center. For each cell, the stimulus that produced excitation with the shortest latency (in the center of the receptive field) was designated the reference stimulus. Thus for the cell in Figure 3A the green (i.e. M+-cone-isolating) spot was the reference stimulus. Most color cells (32/36) showed an increase in response to the reference stimulus if the stimulus was immediately preceded by a stimulus of opposite color, red (L+) versus green (M+) (Fig. 3D). In fact, the response to sequences was predicted by the linear sum of the peak ON response to the reference stimulus plus the peak OFF response to the opposite color (Fig. 3E; slope = 0.95, r² = 0.8). This is interesting because modeling efforts of color vision have shown that the machinery responsible for hue discrimination could be linear (Wyszecki and Stiles, 1982; Hurlbert and Poggio, 1988). As Figure 3E shows, color cells, like simple cells in the cat (Ferster, 1994), blue–yellow retinal ganglion cells (Chichilnisky and Baylor, 1999) and most cells in monkey V1 [(Lennie et al., 1990); but see De Valois et al. (De Valois et al., 2000)] do in fact sum their inputs linearly. But, as Wielaard et al. (Wielaard et al., 2001) have pointed out, the fact that any cortical cell can respond in a linear way is somewhat remarkable given the non-linear nature of the thalamic input and the cortical network itself.
For all 36 cells for which we measured temporal color interactions, we also quantified the percent change of the response to the reference stimulus produced by different preceding stimuli (Fig. 4). A preceding stimulus of opposite color increased the response to the reference stimulus, while a preceding stimulus of identical color decreased the response, as one would expect if these cells were responsible for temporal chromatic contrast (Eskew et al., 1994). This was not true for non-color cells. In fact, the response of non-color cells to the second (i.e. reference) stimulus of a sequence, regardless of the color of the first stimulus, was always reduced when compared with the response to the reference stimulus presented alone, reminiscent of forward masking of luminance stimuli (Macknik and Livingstone, 1998).
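The two quantities used in this analysis, the linear prediction for a two-stimulus sequence and the percent change produced by a preceding stimulus, can be written out as a short sketch. The function names and firing rates are hypothetical, and we assume for illustration that peak responses are expressed as firing-rate increments above baseline.

```python
def predicted_sequence_response(on_peak_reference, off_peak_opposite):
    """Linear prediction: ON response to the reference stimulus plus the
    OFF response that follows the opposite-color stimulus."""
    return on_peak_reference + off_peak_opposite


def percent_change(response_in_sequence, response_alone):
    """Percent change of the reference response caused by a preceding
    stimulus, relative to the reference stimulus presented alone."""
    if response_alone == 0:
        raise ValueError("response to the reference stimulus alone is zero")
    return 100.0 * (response_in_sequence - response_alone) / response_alone


# HYPOTHETICAL red-ON-center cell (spikes/s above baseline).
on_red_alone = 40.0        # peak ON response to red presented alone
off_after_green = 15.0     # peak OFF response following a green spot
red_after_red = 30.0       # observed response to red preceded by red

predicted = predicted_sequence_response(on_red_alone, off_after_green)
facilitation = percent_change(predicted, on_red_alone)
adaptation = percent_change(red_after_red, on_red_alone)
```

In this made-up example a preceding green spot raises the predicted response to red by 37.5%, whereas a preceding red spot lowers it by 25%, the same direction of effects reported for color cells.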
Finally, in a few cells we also determined the responses to temporally shifting stimuli. For example, we measured the response of the cell shown in Figure 3C to a red bar placed first in the receptive field center and then in the receptive-field flank on the right of the receptive-field center. The response to this two-bar-apparent motion was predicted by the sum of the OFF response to the center and the ON response to the surround. The peak response to this stimulus was much higher than the peak response to a red stimulus ‘moving’ in the opposite direction.
Though calculations of temporal and spatial color contrast are begun in V1 (Michael, 1978a,b; Livingstone and Hubel, 1984; Thorell et al., 1984; Conway, 2001; Johnson et al., 2001), it is debated which of the cells in V1 subserve color perception. Some contend that almost all V1 cells are capable of subserving multiple aspects of the visual world, including shape, color and motion (Leventhal et al., 1995; Lennie, 2000); while others argue that only a subset of cells in V1 are specialized to encode color (Livingstone and Hubel, 1984; Engel and Furmanski, 2001). To investigate mechanisms for color contrast, we focused on a subset of cells in V1 that are cone opponent. These cells are not direction selective (see discussion below) and are only coarsely (if at all) orientation selective (Conway, 2001; Johnson et al., 2001), suggesting that they are specialized to process one visual attribute: hue. We therefore call these ‘color’ cells. Many color cells also have spatially opponent receptive fields, which would enable them to contribute to calculations of color constancy (Rubin and Richards, 1982). That these cells are ‘double opponent’ has been shown directly by quantitative receptive field mapping with small spots (Conway, 2001) and is even reflected in the responses to shifting sine-wave gratings [(Johnson et al., 2001); but see Lennie et al. (Lennie et al., 1990)]. These quantitative studies validate the original qualitative observations of Charles Michael (Michael, 1978a,b) and behoove editors to restore a description of double-opponent cells to standard textbooks of neuroscience. Regardless, the responses of V1 cells to simultaneous and sequential pairs of oppositely colored stimuli had not previously been quantified, an experiment necessary to address the role for these cells in spatial and temporal color contrast. Here we did this experiment.
We found that many color cells gave stronger responses to sequences of red and green spots and stronger responses to adjacent red and green spots. The ability of a given cell to encode both spatial and temporal color contrast may well contribute to the interactions of spatial and temporal contrast that are evident perceptually (Fig. 5). Non-cone-opponent cells, on the other hand, responded in a way that was inconsistent with a role for them in spatial or temporal color contrast.
The color cells were selected for study because they gave explicitly opponent responses to red and green spots. Such cells respond well to a patch of color even if the luminance of the patch is matched to that of the surround (as long as the surround is a different color; Fig. 6). Responses to such ‘equiluminant’ stimuli have been taken as necessary and sufficient evidence that a cell is color coding. But this has led to some confusion because many non-cone-opponent cortical cells, which are responsive to luminance borders, are also responsive to equiluminant color borders (Gouras and Kruger, 1979; Thorell et al., 1984; Hubel and Livingstone, 1990; Johnson et al., 2001). These color-luminance cells (Johnson et al., 2001), which are usually very sharply orientation tuned, have been interpreted as evidence that single cortical cells ‘multiplex’ form and color (Gegenfurtner, 2001; Johnson et al., 2001). However, it seems unlikely that these cells contribute directly to calculations of hue because they lack explicit signs of cone opponency: they do not give ON responses to one cone-isolating stimulus and OFF responses to a different cone-isolating stimulus. So what contribution could color-luminance cells make to visual perception? Their wavelength sensitivity may contribute to form vision by enabling the detection of boundaries between different regions that reflect wavelengths differently. The detection of these boundaries, for example, would be important independent of hue discrimination in defeating camouflage and could be a useful cue to object shape. Thus a difference in wavelength reflectance may be an attribute to which many cells in the cortex respond, though different cells may be specialized to use it in different ways to encode qualitatively different aspects of the visual world. Cone-opponent cells use it to encode hue (i.e. color) and color-luminance cells use it to encode a representation of form.
But could we argue that cone-opponent cells, and not color-luminance cells, multiplex color and form? After all, cone-opponent receptive fields show spatial structure; some of them even have asymmetric receptive fields, which respond best to a colored stimulus if it is oriented in such a way as to match the shape of the receptive field [Fig. 1C, and Conway (Conway 2001)]. What should we make of this spatial, and occasionally even ‘orientation’, selectivity? Should we conclude that these cells ‘multiplex’ a representation of form and color? How would we determine that the spatial information is used by the brain to represent form and is not a product of sloppy wiring? Evolution might simply not have gone to all the trouble to ensure that color cell receptive fields be perfectly symmetrical. Alternatively, how can we be sure that these receptive field structures are not actually the most efficient means of representing color? Independent component analysis of color in natural scenes produces basis functions that are coarsely oriented (Tailor et al., 2001; Wachtler et al., 2001), much like some color cell receptive fields (Conway, 2001). Is it not more parsimonious to conclude that the spatial structure of color cell receptive fields exists not because color cells represent ‘form’ (i.e. high spatial acuity of the sort that enables you to read this text) but rather because spatial structure is critical to our perception of hue? Thus we suspect that double-opponent cells, which by virtue of their cone opponency are capable of signaling hue, have spatial structure not simply to signal color boundaries but to signal color itself, because, as artists continue to show us, our perception of hue is profoundly influenced by chromatic context (Albers, 1963). Moreover, the spatial resolution of color vision is relatively low (Mullen, 1985; Livingstone and Hubel, 1987), which is consistent with the relatively large receptive fields of double-opponent cells.
Some have asked if oriented double-opponent cells multiplex color and direction. A red stimulus moving from a green-ON subregion to a red-ON subregion will elicit an OFF response (from the green subregion) that will sum with the ON response (from the red subregion). But we do not think this indicates that these cells signal direction: the color cells’ space–time maps are not slanted [slanted, or ‘spatio-temporally inseparable’, space–time maps are thought to be fundamental to motion perception (Adelson and Bergen, 1985)] and the ‘direction selectivity’ of color cells is quantitatively and qualitatively different from that of direction-selective cells (Livingstone et al., 2000) (B.R. Conway and M.S. Livingstone, submitted for publication). Furthermore, even if these orientation-selective cone-opponent cells could multiplex color and direction, are there enough of them to represent the entire visual world? Probably not, given that the total sum of all cortical color-opponent cells, in all their manifestations, would barely be sufficient to encompass the entire visual world (we estimate that they account for <10% of cortical cells). So, if cortical cells genuinely do not multiplex color, form and direction, then how do we achieve a unified perception of the visual world? Perhaps subsequent visual areas sample from the various specialized populations of cells in V1 [and there is ample evidence that this is true: Van Essen et al. (Van Essen et al., 1992)]. Thus, in the same way that trichromacy exists at the earliest stage of visual processing but subsequently gives rise to opponency, so specialization exists in V1 but subsequently gives rise to ‘multiplexing’.
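The distinction between slanted and unslanted space–time maps drawn above can be illustrated numerically. The following is a minimal sketch, not the analysis used in this study: it assumes synthetic maps on an arbitrary grid and uses a singular-value separability index, which is one common way to quantify space–time separability. The function name `separability_index` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def separability_index(st_map):
    """Fraction of the map's power captured by its first singular component.

    A value near 1 means the map is close to a product of one spatial
    profile and one temporal profile (space-time separable). Lower values
    indicate an inseparable (for example, slanted) map.
    """
    singular_values = np.linalg.svd(st_map, compute_uv=False)
    power = singular_values ** 2
    return float(power[0] / power.sum())


# Hypothetical sampling grid in arbitrary units of space (x) and time (t).
x = np.linspace(-1.0, 1.0, 101)
t = np.linspace(-1.0, 1.0, 101)

# Gaussian envelopes; the width 0.4 is an assumption chosen for illustration.
env_x = np.exp(-x ** 2 / (2 * 0.4 ** 2))
env_t = np.exp(-t ** 2 / (2 * 0.4 ** 2))

# Separable map: the outer product of a spatial and a temporal profile.
separable_map = np.outer(env_x * np.cos(2 * np.pi * 1.5 * x),
                         env_t * np.cos(2 * np.pi * 1.5 * t))

# Slanted map: the carrier phase depends on x + t, so subregions drift
# across space over time and the map cannot be written as one product.
X, T = np.meshgrid(x, t, indexing="ij")
slanted_map = np.outer(env_x, env_t) * np.cos(2 * np.pi * 1.5 * (X + T))

print("separable map index:", separability_index(separable_map))
print("slanted map index:  ", separability_index(slanted_map))
```

Under these assumptions the separable map yields an index very close to 1, whereas the slanted map yields a clearly lower value because its power is split across more than one singular component.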
In summary, the choices of stimuli that we make will ultimately shape how clearly various physiological specializations can be seen. Thus there is a tradeoff between large population studies, in which large numbers of cells are mapped with a small battery of stimuli and one risks gaining only a muddy appreciation of the various cortical specializations, and screening for cells using specialized stimuli, in which one risks giving an inaccurate description of the total population of cortical cells. Perhaps it is an appreciation of the benefit of both types of studies that is leading to a convergent understanding of the mechanism of color processing in V1 (Conway, 2001; Engel and Furmanski, 2001; Johnson et al., 2001).
We wish to thank J. Assad, R. Born, K. Duffy, D. Tsao, R.C. Reid and G. Yellen for comments on the manuscript. This work was funded by NIH grants EY 13135 (M.S.L.) and EY 00605 (D.H.H.), Harvard’s Mind Brain and Behavior Initiative (M.S.L.) and the Natural Sciences and Engineering Research Council of Canada (B.R.C.).