We studied the responses of neurons in area V1 of marmosets to visual stimuli that moved against dynamic textured backgrounds. The stimuli were defined either by a first-order cue (‘solid’ bars, which were either darker or lighter than the background) or by a second-order cue (‘camouflaged’ bars, defined only by coherent motion). Forty-two per cent of the neurons demonstrated a similar selectivity for the direction of motion of the solid and camouflaged bars, thereby characterizing a population of cue-invariant (CI) cells. The other cells either showed different selectivity to the movement of solid and camouflaged bars (non-cue-invariant, or NCI cells), or responded equally well to movement in all directions. CI neurons, which were rare in layer 4, tended to have larger receptive fields and to be more strongly direction selective than NCI cells. Although V1 neurons tended to show maximal responses to camouflaged bars that were longer than the ‘optimal’ solid bars, many CI neurons preferred first- and second-order stimuli of similar lengths. Finally, the activity evoked by the camouflaged bars was delayed in relation to that evoked by solid bars. These results demonstrate that motion CI responses are relatively common in primate V1, especially among a population of strongly direction-selective neurons. They also indicate that this response property may depend on feedback from extrastriate areas, or on complex intrinsic interactions within V1.
In many natural situations, objects are perceived as lighter or darker than the background against which they appear. In this case, it is said that they are defined by first-order (luminance-based) cues (Albright, 1992). However, the visual system is capable of registering shapes even when the luminance of the objects and their background are carefully matched. In this situation, shape recognition is achieved on the basis of second-order cues, which may include differences in binocular disparity, texture, or relative motion. The ability of the visual system to perceive a shape, irrespective of the image feature that defines its boundaries, is referred to as cue invariance (Zipser et al., 1996; Baker, 1999). Previous studies have demonstrated the existence of neurons in the visual cortex whose physiological responses appear to reflect cue invariance: their selectivity to orientation or direction of motion is similar, irrespective of different image cues being used to define the stimulus borders. Neurons showing cue-invariant (CI) responses to various types of second-order stimuli have been reported in a number of visual areas in both cats (Hammond and MacKay, 1977; Redies et al., 1986; Leventhal et al., 1998; Mareschal and Baker, 1998; Khayat et al., 2000) and monkeys (Albright, 1992; Sary et al., 1993; Chaudhuri and Albright, 1997; Marcar et al., 2000; Ramsden et al., 2001). The present study explores the ability of cortical neurons to respond to a particular class of second-order stimuli: objects that are defined solely by coherent motion.
The coherent motion of a figure’s textural elements is a powerful cue for figure–ground segregation, allowing the visual system to detect a camouflaged object even when its luminance, wavelength and textural characteristics are the same as those present in the background. Studies in macaque monkeys have suggested that some neurons in both striate cortex (V1) and the middle temporal area (MT) show similar response selectivities, irrespective of whether they are tested with ‘first-order’ bars, or bars that move against stationary backgrounds of matching texture and luminance (Albright, 1992; Olavarria et al., 1992; Chaudhuri and Albright, 1997). However, the role of primate V1 in the processing of second-order stimuli deserves further study. For example, it has been reported that neurons showing cue invariance to the characteristics of moving patterns are first observed in significant numbers at the level of the second visual area (V2) and may be rare even in MT (Leventhal et al., 1998; O’Keefe and Movshon, 1998). These observations would parallel the findings of studies using other types of second-order stimuli, such as static boundaries defined by differences in motion between two regions of the visual field. In this case, the orientation of the boundary is encoded in a CI manner in V2 and ‘ventral stream’ areas, but not in V1 or MT (Sary et al., 1993; Marcar et al., 1995, 2000).
We investigated the contribution of New World monkey V1 to the CI analysis of moving stimuli by quantifying single neuron activity in response to bars of different lengths, orientations and velocities. In alternate tests, these bars were defined either by a difference in luminance in relation to the background (a first-order, ‘luminance’, or ‘solid’ bar) or by coherent motion alone (a second-order, ‘kinetic’, ‘noise’, ‘textured’, ‘motion-induced’, or ‘camouflaged’ bar) (Hammond and MacKay, 1975; Regan, 1986; Olavarria et al., 1992; Chaudhuri and Albright, 1997; Leventhal et al., 1998). Our results demonstrate unequivocally that a sub-population of V1 cells can encode the characteristics of moving stimuli in a CI manner, therefore mirroring those obtained in the behaving Old World monkey (Chaudhuri and Albright, 1997). Moreover, the data reveal that motion CI V1 cells are distinctive in terms of their ‘classical’ response properties and laminar distribution, and reveal the temporal characteristics of V1 responses to moving motion-defined patterns. These observations support the view that the encoding of different types of second-order boundaries (e.g. static versus moving) may depend on different computational steps and involve different sets of cortical areas.
Materials and Methods
Six adult New World monkeys (Callithrix jacchus, the common marmoset) were used in non-recovery experiments, which were conducted following the ethical guidelines established by the National Health and Medical Research Council. The animals were initially anaesthetized with i.m. injections of ketamine (50 mg/kg) combined with xylazine (3 mg/kg). A tracheotomy was performed and a tracheal tube inserted to enable artificial ventilation. The marmoset was then placed on a thermostatically controlled heating pad and its head was positioned in a stereotaxic frame. The dura mater overlying the dorsal cortical surface was exposed and covered with a thin layer of silicone oil in order to prevent desiccation. After all surgical procedures were completed, the animal was administered an i.v. infusion of pancuronium bromide (0.1 mg/kg/h), combined with sufentanil (6 μg/kg/h) and dexamethasone (0.4 mg/kg/h), in a saline/glucose solution. This induced muscular paralysis while maintaining anaesthesia. The animal was artificially ventilated with a gaseous mixture of nitrous oxide and oxygen (7:3). The level of anaesthesia was monitored using electrocardiographic criteria and the level of cortical spontaneous activity. Administration of atropine (1%) and phenylephrine hydrochloride (10%) eye drops resulted in mydriasis and cycloplegia. Application of contact lenses with a curvature radius of 3.4–3.6 mm focused the eyes on the screen of a computer monitor located 40 cm in front of the animal.
Parylene-coated tungsten microelectrodes with an exposed tip of 10 μm were inserted in the vertical stereotaxic plane, through small cuts in the dura mater. The electrode penetrations were aimed at the caudal portion of the dorsal bank of the calcarine sulcus, which represents a region of the lower visual field around 10–15° of eccentricity (Fritsches and Rosa, 1996). Receptive fields of single cells were initially mapped interactively, using hand-controlled stimuli moved across the computer screen. Following the determination of the receptive field centre, the neuronal response properties were studied quantitatively. Computer-generated visual stimuli were presented on a 20′′ Apple Scan monitor (Apple Computer Inc., Cupertino, CA; refresh rate of 75 Hz, resolution 1024 × 768 pixels).
The stimulation paradigm (Fig. 1) comprised a background pattern, which filled the entire screen (subtending 51.2 × 38.4° of visual angle) and moving bars of different orientations, sizes and velocities, which were swept over the cell’s receptive field. The background pattern was always formed of a dynamic random dot ensemble, in which the luminance of each element fluctuated between ‘black’ (0.3 cd/m²) and ‘white’ (24.8 cd/m²) at a temporal frequency of 1.4 Hz. The space- and time-averaged luminance of the random dot ensemble was 6.1 cd/m². The moving bars were of two types: first-order bars, which were uniformly black (90% contrast relative to the background average) or white (60% contrast), and second-order bars, which were filled with the same dynamic pattern as the background. For simplicity, in this paper the first- and second-order stimuli will be referred to as ‘solid’ (Olavarria et al., 1992) and ‘camouflaged’ (Regan, 1986) bars, respectively. In this type of paradigm, there are no fixed spatial or temporal cues that define the moving camouflaged bar. At any particular moment, the average luminance across all textural elements forming the moving bar was the same as that of any region of the background pattern. Moreover, because each of the textural elements of the bar and background fluctuated, their average luminance across time was the same as that of any other point on the screen. Thus, the definition of the camouflaged bar’s boundaries was solely based on the fact that, at any instant, its textural elements were moving in the same direction and at the same speed. The random and dynamic nature of the stimulus and background pixels minimized the probability of ‘false positive’ results due to local inhomogeneity of static ‘noise’ patterns (Mason, 1976; Hammond, 1991). For as long as the camouflaged bars were moving, they were easily perceived; however, once movement ceased, they quickly blended into the background.
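The contrast values quoted for the solid bars can be checked against the stated luminances. The text does not say which contrast definition was used; the sketch below assumes Michelson contrast between the bar and the mean background luminance, which happens to reproduce the quoted figures. The function name `michelson_contrast` is illustrative only.

```python
def michelson_contrast(l_bar, l_background):
    """Michelson contrast between a bar and the mean background luminance.

    Assumption: the text does not name its contrast definition.
    """
    return abs(l_bar - l_background) / (l_bar + l_background)

# Luminances in cd/m^2, taken from the stimulus description.
BLACK, WHITE, MEAN_BACKGROUND = 0.3, 24.8, 6.1

black_contrast = michelson_contrast(BLACK, MEAN_BACKGROUND)
white_contrast = michelson_contrast(WHITE, MEAN_BACKGROUND)

print(f"black bar: {black_contrast:.1%}, white bar: {white_contrast:.1%}")
# prints "black bar: 90.6%, white bar: 60.5%"
```

Under this assumption the black and white bars come out at roughly 90% and 60% contrast, in line with the values given above.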
For both camouflaged bars and background, the initial grey value of each pixel was generated randomly, using an 8-bit palette, while their size was kept constant (12 min arc). This value was chosen on the basis of preliminary tests (Fig. 2), which found that cells in the studied part of V1 did not respond in phase with the temporal modulation of individual textural elements (and therefore were unlikely to be processing each textural element as a separate ‘stimulus’). The temporal frequency of each element of the dynamic random pattern (1.4 Hz) was also kept constant across experiments. Judging by the ability of human observers to perceive the contour of the camouflaged bars moving against the background, the exact value of this parameter was not critical, at least within a range of values between 0.1 and 3 Hz.
Each neuron was initially tested with solid bars (both black and white) oriented perpendicular to their axes of motion, in order to characterize its selectivity for stimulus direction, speed and length, as well as any separation between ‘on’ and ‘off’ subregions of the receptive field, if present. The bar width was kept constant throughout the testing of a given neuron, at a value less than half the width of the hand-mapped receptive field (typical values were in the range of 0.5–0.8°). Following this initial categorization of the cell’s response properties, a second battery of tests was conducted, in which responses to solid (either black or white, depending on which stimulus elicited the stronger response) and camouflaged bars were compared in paired trials. In this second series of tests, only one parameter (direction, speed or length) was varied at a time, while the others were kept constant at the ‘optimal’ value for the neuron, as defined in the course of the initial tests. Six to ten repeats of each condition were used, randomized amongst other variations of the same parameter. Inter-trial intervals of 3 s were adopted.
At the end of the experiment, the animal was given a lethal dose of sodium pentobarbitone (100 mg/kg) and perfused transcardially with 0.9% saline, followed by 4% paraformaldehyde in 0.1 M phosphate buffer (pH 7.4). Alternate sections (40 μm) were stained for Nissl substance (with cresyl violet) and myelin (Schmued, 1990), allowing the reconstruction of electrode tracks and architectonic boundaries. The nomenclature of Hassler (Hassler, 1966) was used to describe the laminar organization of V1, as it allows a more appropriate comparison between homologous layers across species (Casagrande and Kaas, 1994).
The responses of each cell were converted into peristimulus time histograms (PSTHs) with a 10 ms bin width, which formed the basis of all subsequent analyses. The cell’s response window (corresponding to the period of time during which the stimulus crossed its receptive field) was then defined as the region of the PSTH around the response peak in which the cell’s activity was significantly above the spontaneous activity. Depending on the cell’s selectivity for stimulus speed and its receptive field width, this window was between 180 and 1410 ms.
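The PSTH construction and response-window definition described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the significance criterion is not specified in the text, so the threshold of baseline mean plus two standard deviations, and the function names `build_psth` and `response_window`, are assumptions.

```python
from statistics import mean, stdev

def build_psth(spike_times_ms, n_trials, duration_ms, bin_ms=10):
    """Pool spikes from all trials into bins; return rates in spikes/s."""
    n_bins = int(duration_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int(t // bin_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    scale = 1000.0 / (bin_ms * n_trials)
    return [c * scale for c in counts]

def response_window(psth, baseline_bins, bin_ms=10, n_sd=2.0):
    """Contiguous run of bins around the peak that exceed the threshold.

    Assumed threshold: mean + n_sd * SD of the first `baseline_bins` bins.
    Returns (start_ms, end_ms), or None if the peak is not above threshold.
    """
    baseline = psth[:baseline_bins]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    peak = max(range(len(psth)), key=psth.__getitem__)
    if psth[peak] <= threshold:
        return None
    start = end = peak
    while start > 0 and psth[start - 1] > threshold:
        start -= 1
    while end < len(psth) - 1 and psth[end + 1] > threshold:
        end += 1
    return start * bin_ms, (end + 1) * bin_ms

# Synthetic example: 10 baseline bins followed by a response.
example_psth = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 3, 20, 40, 30, 10, 2, 1]
window = response_window(example_psth, baseline_bins=10)
```

In the synthetic example the window spans 100 to 150 ms, i.e. the five contiguous bins around the peak that exceed the assumed threshold.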
Quantitative measures were used to assess neuronal responses to solid and camouflaged bars. Indices of direction and axis of motion bias were calculated using the vector sum method (Leventhal et al., 1998). Two sets of calculations, corresponding to the vector sum across 360 and 180° ranges, were performed (in the latter case, the neuronal responses to movement in two opposing directions were added). The normalized values obtained from the calculation of the vector module over the 360 and 180° ranges were used as estimates of the cell’s direction index (DI) and axis of motion index (AI), respectively; because bars were always presented at an orientation perpendicular to the axis of motion, these tests did not accurately assess orientation selectivity as an independent measure (Albright, 1984). Based on these indices, the selectivity of each neuron to stimulus motion was classified as unidirectional, bidirectional, or pandirectional (Albright, 1984). A cut-off value of 0.2 was adopted when classifying neurons as unidirectional (DI ≥ 0.2), bidirectional (DI < 0.2, AI ≥ 0.2), or pandirectional (non-selective; AI and DI < 0.2). The vector angle resulting from vector summation over 360° was used as an estimate of the neuron’s optimal direction of motion, while the angle resulting from the sum over 180° was used as an estimate of the cell’s preferred axis of motion.
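One common formulation of the vector sum indices is sketched below. The exact normalization used in the study is not given in the text; here the resultant vector length is divided by the summed response, and the 180° case is implemented by doubling the stimulus angles, which folds opposite directions together. Both choices, and the function names `vector_index` and `classify_selectivity`, are assumptions made for illustration.

```python
import cmath
import math

def vector_index(directions_deg, responses, harmonic=1):
    """Normalized vector-sum index and preferred angle (degrees).

    harmonic=1: sum over 360 deg, giving a direction index (DI).
    harmonic=2: angles are doubled so that opposite directions add,
                giving an axis of motion index (AI) over 180 deg.
    """
    total = sum(responses)
    if total <= 0:
        return 0.0, None
    vec = sum(r * cmath.exp(1j * harmonic * math.radians(d))
              for d, r in zip(directions_deg, responses))
    index = abs(vec) / total
    period = 360.0 / harmonic
    angle = (math.degrees(cmath.phase(vec)) / harmonic) % period
    return index, angle

def classify_selectivity(di, ai, cutoff=0.2):
    """Apply the 0.2 cut-off described in the text."""
    if di >= cutoff:
        return "unidirectional"
    if ai >= cutoff:
        return "bidirectional"
    return "pandirectional"

dirs = list(range(0, 360, 45))        # eight directions of motion
uni = [1, 5, 10, 5, 1, 0, 0, 0]       # tuned to 90 deg
bi = [10, 2, 0, 2, 10, 2, 0, 2]       # tuned to the 0/180 deg axis
flat = [5] * 8                        # no tuning

def label(responses):
    di, _ = vector_index(dirs, responses, harmonic=1)
    ai, _ = vector_index(dirs, responses, harmonic=2)
    return classify_selectivity(di, ai)
```

With these synthetic tuning curves, `label(uni)`, `label(bi)` and `label(flat)` fall into the unidirectional, bidirectional and pandirectional categories, respectively.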
For each cell, measurements of response as a function of length were fitted with polynomial functions, whose fitting was constrained by the requirement that the curve crossed the origin (i.e. the level of spontaneous activity) at length zero. For end-inhibited neurons, the optimal bar length was the value along the x-axis corresponding to the peak of the fitted polynomial curve, while for cells without significant end-inhibition this was the value at which the curve reached an asymptote.
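As an illustration of a fit constrained to pass through the origin, the sketch below fits a quadratic with no constant term by ordinary least squares. The polynomial order used in the study is not stated, so the quadratic form, the fallback to the longest tested bar when no interior peak exists, and the function names are all assumptions; responses are taken to be expressed relative to spontaneous activity.

```python
def fit_quadratic_through_origin(lengths, responses):
    """Least-squares fit of r = a*L + b*L**2 (no constant term)."""
    s11 = sum(x ** 2 for x in lengths)
    s12 = sum(x ** 3 for x in lengths)
    s22 = sum(x ** 4 for x in lengths)
    t1 = sum(x * r for x, r in zip(lengths, responses))
    t2 = sum(x ** 2 * r for x, r in zip(lengths, responses))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("need at least two distinct non-zero lengths")
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return a, b

def optimal_length(lengths, responses):
    """Length at the peak of the fitted curve.

    If the fitted curve has no peak inside the tested range (no
    end-inhibition in this simplified model), fall back to the longest
    tested length as a stand-in for the asymptote.
    """
    a, b = fit_quadratic_through_origin(lengths, responses)
    longest = max(lengths)
    if b < 0:
        peak = -a / (2 * b)
        if 0 < peak < longest:
            return peak
    return longest

# Synthetic end-inhibited cell: responses follow 12*L - 2*L**2.
best = optimal_length([1, 2, 3, 4, 5], [10, 16, 18, 16, 10])
```

For the synthetic end-inhibited example the fitted peak lies at a bar length of 3°.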
Statistical analyses were conducted using SPSS 6.1.1 for the Macintosh. Parametric or non-parametric tests were used for different statistical comparisons, depending on whether or not the distributions deviated significantly from normality (as assessed using the Kolmogorov–Smirnov goodness-of-fit test) and on the uniformity of the variances.
Results
This study is based on the analysis of the responses of 81 V1 neurons, which were isolated for the entire period needed to complete the testing procedures. The sample includes 59 units that were selective to the orientation and/or direction of motion of the solid bars (unidirectional and bidirectional cells; 73% of the sample) and 22 pandirectional cells (27%). Provided that the stimuli were presented at near-optimal direction of motion, length and speed, each of the 81 neurons responded to the movement of the camouflaged bars, as well as that of the solid bars (responses to both types of stimulus being over two standard deviations above the spontaneous activity).
As detailed below, our results indicate that V1 neurons vary in terms of both the relative strength of their response to the solid and camouflaged bars, and the degree to which their response selectivity is independent of the stimulus type. None the less, the findings confirm the existence of a substantial population of neurons in V1 that show CI motion selectivity (Chaudhuri and Albright, 1997). Our results also demonstrate that these CI neurons tend to differ from non-cue-invariant (NCI) neurons in terms of their classical response properties, such as strength of direction selectivity and receptive field size. Finally, they show that V1 neuronal responses to camouflaged bars are delayed relative to responses to solid bars, suggesting that these rely on different sets of computational steps.
Selectivity to Direction of Motion
We were particularly interested in establishing the existence of a population of V1 neurons that demonstrates cue invariance with respect to the motion parameters of first- and second-order stimuli. Therefore, for the purposes of data analysis, cells were classified as CI or NCI, primarily on the basis of their direction tuning curves. More specifically, in order for a neuron to be classified as CI, three criteria had to be met:
(i) the neuron’s responses had to be selective with respect to the stimulus axis and/or direction of motion of the solid bars (pandirectional cells therefore being excluded);
(ii) the same type of motion selectivity (unidirectional or bidirectional) had to be revealed in tests using solid and camouflaged bars of otherwise similar characteristics;
(iii) the optimal direction of motion (for unidirectional cells) or axis of motion (for bidirectional cells) had to be similar (<30° deviation), whether assessed using solid or camouflaged bars.
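The three criteria above can be expressed as a small decision function. This is a sketch of the logic as described in the text, not the authors' code; the function names and the string labels are illustrative. Angular differences are wrapped over 360° for directions and over 180° for axes of motion.

```python
def angular_difference(a_deg, b_deg, period=360.0):
    """Smallest absolute difference between two angles on a circle."""
    d = abs(a_deg - b_deg) % period
    return min(d, period - d)

def classify_cue_invariance(solid_type, camo_type,
                            solid_angle, camo_angle, max_dev=30.0):
    """Return 'CI', 'NCI' or 'pandirectional' for one neuron.

    solid_type / camo_type: 'unidirectional', 'bidirectional' or
    'pandirectional', as determined separately for each stimulus type.
    solid_angle / camo_angle: preferred direction or axis in degrees.
    """
    # Criterion 1: the response to solid bars must be selective.
    if solid_type == "pandirectional":
        return "pandirectional"
    # Criterion 2: same type of selectivity for both stimulus types.
    if camo_type != solid_type:
        return "NCI"
    # Criterion 3: preferred direction or axis within 30 deg.
    period = 360.0 if solid_type == "unidirectional" else 180.0
    deviation = angular_difference(solid_angle, camo_angle, period)
    return "CI" if deviation < max_dev else "NCI"
```

For example, a unidirectional cell preferring 350° with solid bars and 10° with camouflaged bars deviates by only 20° and would be labelled CI, whereas a cell that is unidirectional for solid bars but unselective for camouflaged bars would be labelled NCI.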
Overall, 42% (34/81) of the neurons in our sample were classified as CI, by virtue of meeting all these criteria. Examples of responses of CI neurons, both unidirectional and bidirectional, are illustrated in Figure 3. By contrast, the responses illustrated in Figure 4A,B are representative of the category of NCI neurons (25/81, 31% of the sample). More commonly (15 units), NCI neurons showed a clear selectivity for movement of a solid bar along particular directions, while the responses to a camouflaged bar were relatively unaffected by changes in the direction of motion (Fig. 4A). Less frequently (10 units), the responses to both solid and camouflaged bars were direction or axis of motion selective, but had different characteristics. This included cells which showed unidirectional selectivity with one type of stimulus and bidirectional selectivity with another, as well as cells whose optimal directions/axes of motion differed by 30° or more (Fig. 4B). Among the 22 neurons with pandirectional responses to the solid bars, the vast majority was also non-selective with respect to the motion of the camouflaged bars (Fig. 4C). The few exceptions were cells which showed relatively low, albeit above-threshold, direction or axis of motion biases in response to camouflaged bars (Fig. 5).
The response selectivity of V1 cells to the motion of first- and second-order bars is summarized in Figures 5 and 6. Figure 5 illustrates the comparison of the direction and axis of motion indices obtained through stimulation with solid and camouflaged bars. By virtue of the criteria outlined above, data points corresponding to CI neurons (black symbols) are found only in the shaded quadrants of the graphs. Note, however, that not all data points in these quadrants correspond to CI neurons: for example, neurons showing unidirectional responses to one type of stimulus and bidirectional responses to the other were classified as NCI, as were those which preferred different axes of motion when tested with different stimuli. The data shown in Figure 5 indicate a strong correlation between the DIs obtained with first- and second-order bars (R² = 0.74), but indicate a much weaker correlation between the AIs (R² = 0.19) derived from the same tests. This is mostly due to the fact that the axis of motion tuning in response to camouflaged bars tended to be broader than that observed in response to solid bars (Fig. 5B, insert). In Figure 6, the preferred directions or axes of motion revealed through testing with solid and camouflaged bars are compared. For most V1 cells showing selectivity in response to the two types of stimulus, the angular preference proved to be similar: 82% of this sample (36/44 cells) showed a deviation of <30°.
When tested with first-order stimuli, CI neurons proved to be, on average, more strongly direction selective than NCI neurons [Figs 5B (insert) and 7A]. For example, while 65% (22/34) of the CI neurons showed unidirectional motion selectivity in response to the solid bars, most NCI neurons (64%, 16/25) were classified as bidirectional. A comparison of the distributions of direction indices among CI and NCI cells demonstrated this to be a significant difference (CI median DI = 0.31; NCI median DI = 0.15; Mann–Whitney test, U′ = 266, z = 2.45, P = 0.014). In contrast (Fig. 7B), the distributions of the axis of motion tuning indices assessed with solid bars were not significantly different (t = 0.32, P = 0.749), even though NCI cells tended to have narrower orientation bandwidths (Fig. 5B, insert).
Laminar Distribution and Receptive Field Category
The reconstruction of the laminar position of each cell was based on a comparison between the depth of the recording site, recorded during the experiment, and the location of the histologically reconstructed tracks. This correlation was aided by the recorded location of physiological transitions, such as the increment of spontaneous activity in layer 4, and the interfaces between cortex, white matter and sulci (Snodderly and Gur, 1995). Based on these criteria, neurons were assigned to one of three layer groups: supragranular [layers 2 and 3 of Hassler (Hassler, 1966); layer 3 including layers ‘4A’ and ‘4B’ of Brodmann’s nomenclature]; granular (layer 4, which corresponds to Brodmann’s layer 4C); or infragranular (layers 5 and 6). Only 2 out of 34 CI units (6%) were located in the granular layer, both in its top half (layer 4α). The remaining 32 CI cells were equally distributed through the supra- and infragranular layers (Table 1). There were higher proportions of layer 4 cells among the NCI (24%, 6/25 cells) and pandirectional (32%, 7/22 cells) categories.
We analysed the neuronal responses to slow-moving black and white solid bars, presented at near-optimal orientation, in order to classify cells as simple or complex (Hubel and Wiesel, 1968). Most of the selective neurons (50/59) could be unambiguously assigned as either simple or complex on the basis of the presence or absence of evident separation between ‘on’ and ‘off’ sub-regions — see Figure 8 and earlier work (Casanova et al., 1995). Nine neurons could not be classified for various reasons, including poor responses to slow-moving stimuli, lack of response to either the black or the white bar, or being tested with bars that proved to be too wide relative to the receptive field. The results of this analysis are summarized in Table 1. Among the 34 CI neurons, there were eight simple cells and 18 complex cells, as well as eight units that could not be reliably categorized. The 25 NCI neurons included 13 simple and 11 complex cells, as well as one non-categorized unit. Thus, although the proportion of complex cells may be higher among CI neurons, there is no clear-cut relationship between the simple/complex cell categorization and the type of selectivity to the motion of second-order stimuli.
Selectivity to Stimulus Speed
Examples of neuronal responses to solid and camouflaged bars as a function of their speed are illustrated in Figure 9B,C. In most cases (54/81 cells, 67% of the sample), V1 neurons behaved in the manner illustrated in Figure 9B, showing a similar speed selectivity whether tested with solid or camouflaged bars. Of the 27 neurons which responded maximally to solid and camouflaged bars moving at different speeds, eight (10% of the total sample) required a camouflaged bar that moved faster than the solid bar speed which elicited the best response, while 19 (23% of the total sample) preferred slower camouflaged bars. This bias is apparent in Figure 9A (points below the dashed line), mostly due to cells with a preference for relatively fast solid bars not responding to camouflaged bars moving at a similar speed. None the less, neurons with similar selectivity to first- and second-order stimuli were observed throughout the range of tested bar speeds (7–86°/s). No significant differences were detected between neurons in the CI, NCI or pandirectional categories with respect to the distribution of speed selectivities, either in response to solid [Kruskal–Wallis analysis of variance (ANOVA), χ² = 3.10, P = 0.212] or camouflaged (Kruskal–Wallis ANOVA, χ² = 3.80, P = 0.150) bars. Across each of these categories, most cells showed the strongest stimulus-evoked activity in response to relatively slow moving (<16°/s) stimuli.
Selectivity to Stimulus Length
Figure 10 compares the estimates of optimal length obtained, for each cell, in tests using solid and camouflaged stimuli. This plot reveals a clear bias: in many cases, the maximum response to second-order stimuli requires the use of a camouflaged bar that is considerably longer than the ‘optimal’ solid bar, as determined for the same neuron. None the less, it also demonstrates the existence of a relatively large population of V1 neurons that prefers stimulation with first- and second-order stimuli of similar lengths.
A careful analysis of Figure 10 suggests that CI, NCI and pandirectional neurons differ in terms of their selectivity to the relative length of solid and camouflaged bars, with a larger proportion of CI neurons clustering near the axis of equality (dashed line). In order to test the statistical significance of this observation, we calculated for each cell the ratio between the length of the optimal camouflaged bar and that of the optimal solid bar. On average, we found that the peak response of CI cells was reached with a camouflaged bar that was 32% longer than the optimal solid bar. In contrast, NCI and pandirectional cells required camouflaged bars that were, on average, 86 and 126% longer than the optimal solid bar, respectively. This difference was statistically significant (one-way ANOVA, F = 3.82, P = 0.026). Moreover, while as many as 62% (21/34) of the CI neurons showed maximum responses to solid and camouflaged bars of similar size (estimates of optimal length being within 30% of each other), the corresponding percentages were 48% (12/25) for NCI cells and 27% (6/22) for pandirectional cells.
Estimates of optimal solid bar size were similar among CI, NCI and pandirectional cells (Kruskal–Wallis ANOVA, χ² = 2.03, P = 0.363). However, the distribution illustrated in Figure 10 suggests that the optimal solid bar lengths estimated for unidirectional cells (circles) tend to be shorter than those estimated for bidirectional cells (triangles), within the same eccentricity range. This difference attained statistical significance among CI neurons (unidirectional cells, median length 2.7°; bidirectional cells, median length 6.3°; Mann–Whitney test, U′ = 68.5, z = 2.45, P = 0.015), but not among NCI neurons (unidirectional cells, median length 3.3°; bidirectional cells, median length 4.9°; Mann–Whitney test, U′ = 48.5, z = 1.53, P = 0.127).
Receptive Field Width
The widths of the excitatory neuronal receptive fields were determined quantitatively, on the basis of response windows measured in PSTHs obtained with bars of near-optimal direction, length and speed. In the vast majority of cases (75/81, 93% of the sample), estimates of receptive field width were very similar (being within 20% of each other), irrespective of whether these were determined using solid or camouflaged bars (see also Figs 3 and 4). The range of receptive field widths in our sample (1.3–10.2°) is in reasonable agreement with previous estimates (0.9–6.0°), based on qualitative mapping of multi-unit response fields in the same part of V1 (Rosa et al., 1997). Receptive field widths were smaller among cells in layer 4 (mean ± standard deviation = 2.50 ± 0.51°), as compared with those in the supragranular (4.20 ± 2.21°) and infragranular (3.7 ± 1.61°) layers.
Figure 11 illustrates the distribution of the excitatory receptive field widths for the entire sample of neurons. There was a significant between-category difference (one-way ANOVA, F = 7.57, P < 0.001), with CI neurons having receptive fields which were, on average, larger (4.53 ± 2.26°), than those of pandirectional (3.16 ± 1.27°) or NCI (2.98 ± 0.85°) neurons. This difference cannot be explained solely on the basis of laminar biases, as it remains significant if neurons located in layer 4 are excluded (one-way ANOVA, F = 4.87, P = 0.011). While NCI and pandirectional neurons formed relatively homogeneous populations, there was a suggestion (Fig. 11) that CI neurons may include two subpopulations, one with smaller receptive fields in the same range as NCI and pandirectional neurons (<5°) and another with very large (>5°) receptive fields. CI neurons with large receptive fields were found in both supragranular and infragranular layers.
Strength of Stimulus-evoked Activity and Basal Discharge
With few exceptions, the responses elicited by camouflaged bars of near-optimal characteristics were weaker than those evoked by the corresponding solid bars (Fig. 12A). Across the entire sample, the peak discharge rate evoked by camouflaged bars was, on average, only 55% of that evoked by a solid bar. There was, however, a marked difference between CI, NCI and pandirectional neurons in this respect. While the responses of CI and pandirectional cells to camouflaged bars tended to be relatively strong (CI, 63 ± 24% of the response to a near-optimal solid bar; pandirectional, 65 ± 32%), those of NCI cells were, on average, much weaker (37 ± 26%). These differences were statistically significant (Kruskal–Wallis ANOVA, χ² = 15.85, P < 0.001). Simple and complex cells differed in the relative strength of response to camouflaged bars (simple cells, 40 ± 28% of the response to solid bar; complex cells, 58 ± 28%; t = 2.26, P = 0.032).
There was also significant variability with respect to the level of neuronal basal discharge in response to the dynamic background pattern alone (Fig. 12B), which was measured during 1 s intervals prior to each trial. In this case, CI neurons showed relatively low basal discharges (median 0.58 spikes/s), while NCI and pandirectional neurons were comparatively more active (medians 1.4 and 1.3 spikes/s, respectively). A Kruskal–Wallis ANOVA revealed these differences to be significant (χ² = 6.16, P = 0.046).
Timing of the Responses to First- and Second-order Stimuli
The comparison of the timing of the neural responses evoked by first- and second-order stimuli was based on paired trials, in which solid and camouflaged bars of similar, near-optimal characteristics were used as stimuli. This approach allows a robust determination of the relative latency of the responses to each of the two types of stimulus, irrespective of technical issues which may be associated with the determination of the absolute latency of neuronal responses to moving stimuli (Nowak and Bullier, 1997; Mareschal and Baker, 1998; Raiguel et al., 1999). Three measures of response timing were analysed: the PSTH median (the moment at which half of the spikes have been fired in response to a given stimulus); the peak response time; and the response onset time (the moment at which the neuronal response first surpassed 10% of the peak value). As illustrated in Figure 13, the results obtained with these different analyses converge to demonstrate that the V1 neuronal responses to camouflaged bars were significantly delayed relative to those to solid bars. Across the entire sample, the response median was delayed an average of 36.8 ms when camouflaged bars were used as stimuli, in relation to those trials in which solid bars were used. Likewise, the response peak and onset were reached on average 39.1 and 31.0 ms later, respectively, in trials using camouflaged bars. There were no significant differences between CI, NCI and pandirectional neurons, whether the relative latencies of the median (one-way ANOVA, F = 0.05, P = 0.951), peak (F = 1.11, P = 0.335), or onset (F = 0.08, P = 0.92) were compared.
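The three timing measures, and the relative delay between paired solid and camouflaged trials, can be sketched as follows. This is a simplified illustration rather than the study's analysis code: it assumes each PSTH has already been restricted to the response window and expressed relative to spontaneous activity, reports times at bin centres, and uses hypothetical function names.

```python
def timing_measures(psth, bin_ms=10):
    """Median, peak and onset times (ms, at bin centres) of a PSTH."""
    total = sum(psth)
    peak_value = max(psth)
    # Median: bin in which half of the spikes have been fired.
    cumulative, median_bin = 0.0, 0
    for i, v in enumerate(psth):
        cumulative += v
        if cumulative >= total / 2:
            median_bin = i
            break
    peak_bin = psth.index(peak_value)
    # Onset: first bin in which the response surpasses 10% of the peak.
    onset_bin = next(i for i, v in enumerate(psth) if v > 0.1 * peak_value)

    def centre(i):
        return (i + 0.5) * bin_ms

    return {"median": centre(median_bin),
            "peak": centre(peak_bin),
            "onset": centre(onset_bin)}

def relative_delay(solid_psth, camo_psth, bin_ms=10):
    """Camouflaged minus solid timing, for each of the three measures."""
    solid = timing_measures(solid_psth, bin_ms)
    camo = timing_measures(camo_psth, bin_ms)
    return {key: camo[key] - solid[key] for key in solid}

# Synthetic paired trials: the camouflaged response is weaker and later.
solid_example = [0, 2, 10, 20, 10, 2, 0, 0, 0, 0]
camo_example = [0, 0, 0, 0, 0, 1, 5, 10, 5, 1]
delays = relative_delay(solid_example, camo_example)
```

Because only the difference between paired conditions is used, any constant offset in the absolute latency estimate cancels out, which is the rationale given above for comparing relative rather than absolute latencies.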
We investigated the selectivity of neuronal responses in primate V1 to bars defined by coherent motion, a powerful second-order visual cue for figure-ground segregation. The results indicate not only that most V1 cells responded to this type of stimulus, but also that, in a large number of cases, their responses were CI: their response selectivities were similar, whether the stimulus was defined by luminance, or by coherent motion alone. Our results illustrate the physiological characteristics of these CI neurons and reveal distinct biases in their ‘classical’ response properties and anatomical distribution. Finally, we found that neuronal responses to second-order (‘camouflaged’) bars were delayed relative to responses to first-order (‘solid’) bars. These observations suggest that, while cue invariance to moving objects is relatively common among V1 cells, the generation of this response property may depend on complex neuronal interactions, involving either feedback from extrastriate areas or slow-conducting intrinsic connections within V1.
Classification of Neurons
While our primary aim was to identify those V1 cells which show similar selectivity in response to first- and second-order bars, we were also interested in testing whether or not these neurons are distinctive with respect to other, ‘classical’ response properties. In principle, specific biases in their responses to first-order stimuli could help clarify the neural circuits underlying cue invariance. Therefore, as a step in the analysis, we classified neurons which showed motion selectivity into two categories, CI or NCI cells, depending on whether or not similar selectivities were revealed in response to solid and camouflaged bars. This binary classification is somewhat arbitrary: by relying on cut-off values for direction and axis of motion indices, and for angular differences for direction selectivity, it fails to address the possibility that V1 cells vary along a continuum. None the less, we feel that the trends revealed by the present analysis demonstrate the important point that cue invariance, at least with respect to contours defined by coherent motion, is not equally distributed among all classes of V1 neurons. For example, in comparison with other V1 cells, CI neurons tended to be more strongly direction selective and to have larger receptive fields. They were also unequally distributed across layers, being rare in layer 4. Finally, neurons deemed CI with respect to motion selectivity also tended to show selectivity for solid and camouflaged bars of similar length, while this tendency was much weaker among NCI and pandirectional cells. Together, these observations point to a neural circuit involving a subset of V1 neurons, which is capable of detecting the movement of an otherwise ‘camouflaged’ object and even retrieving some coarse information about its shape and size. It is also the case that both the CI and NCI categories could comprise more homogeneous subcategories. An example of this comes from the work of Chaudhuri and Albright (Chaudhuri and Albright, 1997), who described cells that responded in a direction-selective manner to first-order bars, but lost direction selectivity when tested with second-order bars (whilst maintaining the same axis of motion selectivity). While the criteria we adopted resulted in such cells being classified as NCI, one could argue that they would be better regarded as an intermediate category.
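To make concrete how a cut-off-based binary classification of this kind operates, and why it cannot capture a continuum, the following simplified sketch may help. It is purely illustrative: the index definition, the cut-off values (`di_cutoff`, `max_diff_deg`) and the function names are assumptions introduced here, not the criteria used in the present study, and the sketch ignores axis-of-motion (bidirectional) selectivity altogether.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute angle between two directions, in degrees (0-180)."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)

def tuning_summary(directions_deg, responses):
    """Return (preferred direction, direction index) for one tuning curve.

    Direction index = 1 - R(opposite) / R(preferred). Requires that the
    direction opposite to the preferred one was also tested.
    """
    best = max(range(len(responses)), key=lambda i: responses[i])
    opposite = directions_deg.index((directions_deg[best] + 180) % 360)
    if responses[best] <= 0:
        return directions_deg[best], 0.0
    return directions_deg[best], 1 - responses[opposite] / responses[best]

def classify(directions_deg, solid, camouflaged,
             di_cutoff=0.5, max_diff_deg=45):
    """Label a motion-selective cell 'CI' or 'NCI' (hypothetical cut-offs)."""
    pref_s, di_s = tuning_summary(directions_deg, solid)
    pref_c, di_c = tuning_summary(directions_deg, camouflaged)
    same_direction = angular_difference(pref_s, pref_c) <= max_diff_deg
    if di_s >= di_cutoff and di_c >= di_cutoff and same_direction:
        return "CI"
    return "NCI"

# Hypothetical tuning curves (spikes/s at 8 directions); NOT study data.
directions = list(range(0, 360, 45))
solid = [2, 5, 20, 6, 2, 1, 3, 2]
camo_selective = [1, 6, 12, 5, 2, 1, 4, 1]
camo_flat = [5, 5, 6, 5, 5, 5, 5, 5]

label_a = classify(directions, solid, camo_selective)
label_b = classify(directions, solid, camo_flat)
```

Under any such scheme, a cell whose index falls just below the cut-off receives the same label as a cell with no selectivity at all, which is the limitation acknowledged above.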
Comparison with Previous Studies
In cats, it has long been known that V1 cells respond to second-order, motion-induced bars (Hammond and MacKay, 1975, 1977). These experiments, which employed drifting bars filled with both static and dynamic ‘noise’ (the latter resembling our camouflaged bars, except that the pixels alternated between black and white, rather than varying continuously in grey levels), reveal many elements in common with our results, but also some differences. In both the cat and the marmoset, V1 neuronal responses to textured bars tend to be weaker than those to luminance bars of otherwise similar characteristics (Hammond and MacKay, 1977). Moreover, in both species, a large proportion of the cells which respond to second-order bars demonstrate unidirectional selectivity; those cells with markedly unidirectional responses to textured patterns also tend to be selective for short bar lengths (Hammond and Pomfrett, 1989). Finally, our observation that CI neurons are present in both supra- and infragranular layers, but rare in layer 4, is broadly compatible with the results obtained in the cat (Hammond and MacKay, 1977; Edelstyn and Hammond, 1988). As electrolytic lesions were not used to mark recording sites in the present study, in order to avoid disrupting the local interconnections between V1 cells, our histological reconstruction lacked sufficient precision to allow the assignment of cells to specific sublayers. None the less, the prevalence of strong direction selectivity suggests that in marmosets, as in cats, supragranular CI neurons may be concentrated in layer 3c, which provides the bulk of projections to ‘dorsal stream’ areas (Vogt-Weisenhorn et al., 1995).
It has been reported (Hammond and MacKay, 1977) that many cells in cat V1 demonstrated selectivity to different directions of motion when tested with first- and second-order bars. Although the relative incidence of this response pattern was not reported, the authors did note that such cells tended to be located in the infragranular layers. Our results indicate that cells with different selectivity for solid and camouflaged bars are relatively rare (10/81 cells, this number including those units showing unidirectional selectivity to one type of bar and bidirectional selectivity to the other) and do not reveal any laminar bias in their distribution. More commonly, we found neurons which preferred movement in the same axis or direction of motion (34/81 cells), or which lacked selectivity to the parameters of motion of camouflaged bars, while demonstrating selectivity to solid bars (15/81 cells). In this respect, our observations in the marmoset closely parallel those of Chaudhuri and Albright (Chaudhuri and Albright, 1997) in the behaving macaque monkey. Together, these results suggest that a substantial number of V1 neurons, both in New World and Old World monkeys, can convey correct directional information about motion-defined camouflaged bars, and that the computations responsible for this analysis are operating in both conscious and anaesthetized primates. However, data in both species also reveal broader direction tuning in response to second-order bars than that assessed with first-order bars. Studies in which cat V1 cells were stimulated with drifting full-field (background) textured patterns also usually report wider direction bandwidths, in comparison with those assessed using bars or gratings (Casanova et al., 1995).
Initial reports in cat V1 emphasized that only complex cells respond to moving textured patterns, including second-order drifting ‘noise’ bars (Hammond and MacKay, 1975, 1977). This is difficult to reconcile with the fact that each of the 81 cells in our sample responded, to a greater or lesser degree, to the camouflaged bars. One possible criticism is that the textural elements used in our experiments (12′ arc) could have been large enough to act as a first-order stimulus for the simple cells. However, there are several reasons why we believe this is extremely unlikely. First, we explored the peripheral (>10° eccentricity) visual field, where receptive fields are large; even the smallest receptive field in our sample was large enough to encompass several textural elements across its width (1.3°). Second, given the calculated Nyquist frequencies of the marmoset retina (Wilder et al., 1996), the dynamic textural elements were probably just large enough to be visible in that region of the visual field, especially given that they were presented in the context of a ‘busy’, contrast-matched background. Third, the randomized assignment of grey levels to each textural element, and their continuous variation, eliminated the possibility of any consistent pattern being generated across trials. Finally, similar to our findings, Chaudhuri and Albright (Chaudhuri and Albright, 1997) concluded that, in the behaving macaque, both simple and complex cells responded to moving second-order bars. More recent studies in the cat have also disputed the view that only complex cells respond to second-order moving visual patterns (Casanova et al., 1995; Mareschal and Baker, 1998).
Leventhal and collaborators (Leventhal et al., 1998) studied the neuronal responses in areas V1 and V2 to moving stimuli defined by various first- and second-order cues. These included not only bars defined by luminance and coherent motion, similar in many respects to those we used, but also those defined by differences in textural grain, by illusory contours, or by combinations of more than one cue. In agreement with our observations, Leventhal and collaborators reported that CI neurons tended to have relatively large receptive fields, in comparison with those of NCI neurons. However, they also reported that such cells were extremely rare in V1 of both cats and macaques. Moreover, even in V2, the neuronal response selectivity to bars defined solely by motion appeared extremely weak in all illustrated examples; typically, clear selectivity depended on the foreground object also differing from the background in terms of textural density [Figs 5–7 of Leventhal et al. (Leventhal et al., 1998)]. In a few cells in which we manipulated the textural grains of the foreground and background patterns independently, this effect was not obvious (Fig. 14). One possible explanation of these discrepancies is that the neuronal computations responsible for the responses of V1 cells to stimuli defined by coherent motion are particularly sensitive to halothane anaesthesia (used by Leventhal et al.), but comparatively less affected by sufentanil. Another possibility relies on the fact that their experiments specifically sought cells with CI responses to several different types of stimulus, while in the present study the same battery of tests (consisting of only two types of stimulus) was applied to every well-isolated unit. It may be that, in V1, cells which respond to moving bars that match the background’s texture and luminance do not show cue invariance with respect to other types of stimulus (e.g. illusory contour induced). Thus, such cells may have been under-represented in Leventhal and collaborators’ (Leventhal et al., 1998) sample. Indeed, it is not necessarily the case that there is a single category of CI neurons in V1 or V2, as different populations of cells may show cue invariance with respect to different combinations of image cues (Chaudhuri and Albright, 1997; Ramsden et al., 2001).
Effect of Textured Background on Length Selectivity
The present work provides an interesting contrast with experiments in which first-order stationary bars were presented either alone or in the context of different background patterns (Knierim and Van Essen, 1992; Kapadia et al., 1995, 1999; Nothdurft et al., 1999; Hupé et al., 2001b). Among the prominent background effects which have been described in V1 of behaving macaques is a change in the neuron’s length summation properties when the static bar is made part of a textural array (Kapadia et al., 1999). In this situation, most neurons display length summation over larger regions of the receptive field, in comparison with the situation in which the bar is presented alone. Thus, reducing the salience of the bar by the introduction of a ‘cluttered’ background appears to have a similar effect to lowering its contrast (Sceniak et al., 1999). Our results are qualitatively similar, given that the optimal camouflaged bars (which were arguably ‘less salient’) tended to be longer, on average, than the optimal solid bars determined for the same cells. However, there may be quantitative differences between the results obtained with the two paradigms. While Kapadia and collaborators (Kapadia et al., 1999) reported that the optimal bars were, on average, 2.12 times longer in the textured background condition, in our experiments the effects were less dramatic: optimal camouflaged bars were, on average, 1.76 times longer than the optimal solid bars (1.55 times, if pandirectional cells are excluded). Moreover, while in the former study the change in length summation properties was evident in nearly every cell, our data (Fig. 10) revealed a substantial population of V1 cells that prefer solid and camouflaged bars of similar lengths, including most neurons classified as CI on the basis of motion selectivity. Finally, unlike in our data, Kapadia et al. (Kapadia et al., 1999) observed no change in response latency when the stimulus was made part of a textured pattern, even though the response median was delayed. It remains to be determined whether these discrepancies are solely due to methodological issues (e.g. the use of behaving versus anaesthetized monkeys, or different species), or constitute actual physiological differences in the response pattern of V1 cells in the different situations. Thus, our results support the view that V1 neurons are capable of summating excitatory inputs over different areas, depending on the situation (Sceniak et al., 1999), and confirm that summation tends to occur over smaller areas if the stimuli are made more salient. This could be seen as an aspect of ‘instantaneous’ plasticity, as if the receptive fields expanded or shrank in response to the functional requirements at a given moment and in a particular part of the visual field. However, it is possible that different neural mechanisms may be engaged depending on the exact combination of cues defining the stimulus. In particular, coherent motion may be powerful enough to generate a reasonably high level of stimulus visibility, even in the absence of other cues. Thus, many V1 cells appear able to operate within similar spatial parameters, independent of whether a combination of luminance and motion (as in the case of solid bars) or coherent motion alone (camouflaged bars) defines the object’s boundaries.
Mechanism of Generation of CI Responses in V1
In the macaque monkey, it has been demonstrated (Hupé et al., 1998) that feedback from extrastriate area MT influences the responses of neurons in areas V1, V2 and V3 (third visual area) to ‘solid’ bars moving across a background comprising a non-modulated, black/white texture. Their observations indicate that feedback connections may contribute to figure–ground segregation, by enhancing the responses of cells in these areas to the foreground object; moreover, this contribution becomes more substantial in situations of ‘low salience’, in which a moving bar and the background are similar. In their experiments, low salience was created by reducing the luminance contrast between the foreground bar and the background, or by introducing coherent movement of the foreground and background — ‘the bar is barely visible when both bar and background are stationary or moving coherently. The movement of the bar on the stationary background makes it clearly visible’ (Hupé et al., 1998). The camouflaged bars used in the present experiments can be seen as an extreme example of a low-visibility stimulus, in which there is no luminance contrast between the stimulus and background; only coherent motion makes it visible. One can therefore hypothesize that the V1 neuronal responses to this type of stimulus could also be enhanced by feedback from MT. Moreover, this excitatory drive would probably come from neurons that are direction selective, thereby helping to explain the cue invariance observed in the present experiments. The fact that some CI neurons show bidirectional motion selectivity suggests that other extrastriate areas may also be sending feedback that is relevant for sharpening the response selectivity of V1 cells to second-order stimuli. These may include the dorsomedial area (DM), where cells with uni- and bidirectional responses are represented in nearly equal proportions (Rosa and Schmid, 1995), as well as V2. 
The suggestion that V1 neuronal responses to stimuli defined by coherent motion reflect the activity of feedback pathways is also relevant in the context of explaining the responses of simple cells to camouflaged bars. Such responses could be influenced by feedback from complex cells in extrastriate cortex, capable of integrating the activity of many elementary motion detectors — the ‘second array’ of two-layer motion analysis models (Zanker, 1997). Finally, there are indications that feedback connections may be particularly sensitive to the details of the preparation, including the type and dose of anaesthetic (Lamme et al., 1998). Thus, methodological differences could explain some of the divergent results in the literature regarding the responsiveness of simple cells to textured patterns; this is reviewed elsewhere (Casanova et al., 1995).
Our observation that responses to camouflaged bars are delayed relative to responses to solid bars is also compatible with the notion that the former depend on further computational steps, involving feedback from extrastriate cortex (Zipser et al., 1996; Nothdurft et al., 1999). Indeed, cells with CI properties were rare in layer 4, where terminations of feedback axons are absent or extremely sparse (Rockland, 1997). Moreover, our estimates of the delay in the V1 responses to camouflaged bars (30–40 ms) are compatible with recent estimates of the time-frame of feedback effects of MT onto V1 (Bullier, 2001; Pascual-Leone and Walsh, 2001). However, tracing a parallel between delayed responsiveness and feedback connections is difficult without direct tests, such as reversible inactivation. As demonstrated by recent results in the macaque, the modulatory effect of feedback connections can be extremely rapid (Hupé et al., 2001a,b), becoming manifest in <10 ms. Thus, there may be no causal relationship between feedback and the delay we observed. Moreover, lowering the stimulus intensity is known to increase the latencies of neuronal responses (Levick, 1973). It is therefore possible that the response delay to camouflaged bars reflects, in part, a longer integration time needed to achieve suprathreshold membrane potentials in V1, due to the less correlated firing induced by these stimuli in the afferent layers. The extent to which our observations can be explained solely on the basis of this factor remains unclear. Although some studies have observed contrast-related changes in latency without concomitant reductions in response strength (Gawne et al., 1996), others have reported that marked increases in latency caused by attenuation of first-order stimuli are accompanied by conspicuous reductions in neuronal response (Maunsell and Gibson, 1992). In our data, the responses of CI neurons to moving camouflaged bars were usually only moderately reduced (and in some cases of similar magnitude) when compared to those elicited by solid bars.
Finally, it is now realized that the effects of long-range intrinsic connections within V1 can be significantly delayed in comparison with those arising from feedforward processing, in part due to the slow-conducting nature of the horizontal axons that form this pathway (Bringuier et al., 1999; Girard et al., 2001). One can imagine that the detection of local motion gradients in different parts of the visual field is followed by integration within V1, via excitatory horizontal axons. In particular, a subpopulation of direction-selective neurons with long receptive fields, such as those described in layer 6 (Gilbert, 1977), would be in a good position to perform this integration; via interlaminar feedback, they could introduce or sharpen response specificity to moving textured patterns. Such a feedback loop is a prominent feature of some models which simulate perceptual grouping by cortical neurons (Ross et al., 2000).
In summary, although our observations may suggest that excitatory feedback from ‘dorsal stream’ areas boosts V1 neuronal responses to stimuli defined by coherent motion and introduces a degree of CI stimulus specificity, they alone cannot prove that this is the case. However, they do raise some specific predictions to be tested by future experiments. If this hypothesis is correct, one would expect that inactivation experiments of the type conducted by Hupé et al. (Hupé et al., 1998) will result in reduced responsiveness of V1 neurons (and, in particular, simple cells) to camouflaged bars, and lead to a loss of CI stimulus specificity.
| | Supragranular (n = 29) | Granular (n = 15) | Infragranular (n = 37) |
|---|---|---|---|
| CI (n = 34) | 16 cells (9 complex, 3 simple, 4 undetermined) | 2 cells (1 simple, 1 undetermined) | 16 cells (9 complex, 4 simple, 3 undetermined) |
| NCI (n = 25) | 9 cells (3 complex, 6 simple) | 6 cells (2 complex, 3 simple, 1 undetermined) | 10 cells (6 complex, 4 simple) |
| Pandirectional (n = 22) | 4 cells | 7 cells | 11 cells |
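The laminar counts in the table can be tallied to confirm the marginal totals and to express the scarcity of CI cells in the granular layer as a proportion. The short sketch below only re-adds numbers given in the table; the variable names are illustrative.

```python
# Cell counts per class and laminar compartment, copied from the table.
counts = {
    "CI": {"supragranular": 16, "granular": 2, "infragranular": 16},
    "NCI": {"supragranular": 9, "granular": 6, "infragranular": 10},
    "Pandirectional": {"supragranular": 4, "granular": 7, "infragranular": 11},
}
layers = ["supragranular", "granular", "infragranular"]

# Marginal totals: cells per class (rows) and per compartment (columns).
row_totals = {cls: sum(row.values()) for cls, row in counts.items()}
col_totals = {layer: sum(counts[cls][layer] for cls in counts)
              for layer in layers}
n_cells = sum(row_totals.values())

# Percentage of the cells in each compartment that were classified as CI.
ci_percent = {layer: round(100 * counts["CI"][layer] / col_totals[layer])
              for layer in layers}

for layer in layers:
    print(f"{layer}: {counts['CI'][layer]}/{col_totals[layer]} CI "
          f"({ci_percent[layer]}%)")
```

The totals agree with the n values in the table headings (34, 25 and 22 cells by class; 29, 15 and 37 by compartment; 81 overall). CI cells make up about 55% of the supragranular sample and 43% of the infragranular sample, but only about 13% of the granular sample.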
Funded by research grants from the Australian Research Council (A09937020) and the National Health and Medical Research Council (990007). Equipment support from the Clive and Vera Ramaciotti Foundation and ANZ Charitable Trust is gratefully acknowledged.