There are notable differences in functional properties of primary visual cortex (V1) neurons among mammalian species, particularly those concerning the occurrence of simple and complex cells and the generation of orientation selectivity. Here, we present quantitative data on receptive field (RF) structure, response modulation, and orientation tuning for single neurons in V1 of the tree shrew, a close relative of primates. We find that spatial RF subfield segregation, a criterion for identifying simple cells, was exceedingly small in the tree shrew V1. In contrast, many neurons exhibited elevated F1/F0 modulation that is often used as a simple cell marker. This apparent discrepancy can be explained by the robust stimulus polarity preference in tree shrew V1, which inflates F1/F0 ratio values. RF structure mapped with sparse-noise—which is spatially restricted and emphasizes thalamo-cortical feed-forward inputs—appeared unrelated to orientation selectivity. However, RF structure mapped using the Hartley subspace stimulus—which covers a large area of the visual field and recruits considerable intracortical processing—did predict orientation preference. Our findings reveal a number of striking similarities in V1 functional organization between tree shrews and primates, emphasizing the important role of intracortical recurrent processing in shaping V1 response properties in these species.
The primary visual cortex (V1) is arguably the best understood of the sensory cortices and still remains the focus of considerable experimental as well as theoretical work. In examining neural responses to visual stimuli, a distinction between simple and complex cell responses is commonly made by many investigators. While there has been some debate concerning the classification criteria and even the existence of the distinction itself (Kagan et al. 2002; Mechler and Ringach 2002; Priebe et al. 2004; Wielaard and Sajda 2006), 2 main parameters often exhibit bimodal distributions and are thus commonly used for distinguishing simple and complex cells. These parameters are (1) the spatial overlap between parts of the receptive field (RF) responsive to bright and dark stimuli and (2) the temporal modulation of the response to a drifting sinusoidal grating that is quantified by the F1/F0 ratio. Simple cells exhibit a characteristic low subfield overlap and high F1/F0 ratio parameters, such that bright and dark stimuli activate distinct parts of the RF and the response is strongly dependent on the phase of a drifting grating stimulus. Notably, the distributions of both the overlap index (OI) and the F1/F0 ratio vary considerably between different species. In carnivores such as cat (Hirsch and Martinez 2006a) and ferret (Alitto and Usrey 2004), as well as in rodents such as mouse (Niell and Stryker 2008) and rat (Burne et al. 1984; Girman et al. 1999), there tend to be similar numbers of simple and complex cells. In contrast, simple cells are more rare in macaque monkey V1 (Hubel and Wiesel 1968; Kagan et al. 2002; Ringach et al. 2002; Mata and Ringach 2005; Chen et al. 2009) and particularly cells with low overlap indices and thus well separable RF subfields are notably infrequent. 
Given these differences between species, we set out to quantify the OI and the F1/F0 ratio, as well as a number of additional RF parameters, in V1 of the tree shrew, a small diurnal mammal that is a close relative of primates.
The differing proportions of simple and complex cells observed in various species are of particular interest, because simple cells are thought to provide the basis for orientation selectivity in V1 (Hubel and Wiesel 1962). Particularly in carnivores, the orientation preference of single neurons can be accurately predicted by the spatial structure of the RF (Chapman et al. 1991; Lampl et al. 2001; Jin et al. 2011), which is generally estimated using the sparse-noise stimulus. Because sparse-noise consists of briefly flashed small spots of light that cover only a small part of the visual field, it emphasizes feed-forward thalamocortical inputs to V1 neurons (Yeh, Xing, Williams, et al. 2009). We were thus interested in investigating whether sparse-noise mapped RF structure was related to orientation preference in the tree shrew V1. Previous findings indeed suggest that this might not be the case, because intracortical processing appears to contribute more strongly to orientation selectivity in the tree shrew than is the case in carnivores (Chisum et al. 2003; Mooser et al. 2004; Bhattacharyya et al. 2012). We therefore additionally estimated RF structure using a second stimulus set, the Hartley subspace stimulus, consisting of gratings of different orientations, phases, and spatial frequencies. Unlike sparse-noise stimuli, individual Hartley subspace stimuli typically extend well beyond the classical RF and can thus be expected to recruit considerably more intracortical processing.
We find that orientation selectivity in tree shrews can indeed be predicted from RF structure determined using Hartley subspace, but not sparse-noise stimuli, which is consistent with intracortical mechanisms playing a dominant role in shaping selectivity in this species. Moreover, sparse-noise mapping revealed that simple cells were very rare in the tree shrew V1, particularly when RF subfield overlap is used as the criterion. We argue that simple cell incidence based on the F1/F0 modulation ratio is overestimated as a result of robust stimulus polarity preference that characterizes tree shrew V1.
Materials and Methods
Experiments were performed on 16 adult tree shrews (7 females and 9 males, 150–280 g) (Tupaia belangeri), aged 3–8 years. Animals initially received Ketanarkon (i.m. 100 mg/kg) and Atropine (i.m. 0.02 mg/kg). We then performed a tracheotomy and administered the muscle relaxant Pancuroniumbromide (i.p., initial dose of 0.4 mg/kg, then 0.2 mg/kg approximately every 45 min). Animals were artificially respirated at 100 strokes per minute (Harvard Instruments Respirator) using a mixture of 70% N2O and 30% Oxycarbon (95%O2/5%CO2) and Isoflurane (0.5–1.5%). The animal was then transferred to a stereotaxic device (David Kopf) that was modified to permit visual stimulation. Animals wore contact lenses to prevent drying of eyes. A drop of atropine was applied to the eye for pupil dilation. Visual stimulation was monocular, and the other eye was covered. To gain access to the primary visual cortex, the temporal muscle was removed, the bone cleaned, and a hole (∼2–3 mm diameter) was drilled around anteroposterior (AP) −1 mm and mediolateral +4 mm. A small slit was made in the dura using a syringe needle to permit introduction of the tetrodes into the cortex. After tetrode placement, the cortex was covered with lukewarm 2% agarose (in 0.9% NaCl) to prevent drying and to provide stability. During the entire experiment, body temperature was maintained at 37°C via an electric heating pad controlled by a rectal thermal probe. The heart rate was monitored using a Tektronix differential amplifier, with typical values ranging from 380 to 420 beats per minute. All procedures were conducted according to local regulations approved by the veterinary office of the Canton of Fribourg and in compliance with European Union directives.
Tetrodes were fabricated as described in Nguyen et al. (2009), by twisting together four 12.5-µm diameter nickel–chromium wires (RO-800; Kanthal Precision Technology) for a total diameter of 25–35 µm and impedances were reduced to 200–300 kΩ by gold plating. Two or three tetrodes spaced between 100 and 2800 µm apart were advanced into the primary visual cortex using a hydraulic or manual microdrive (David Kopf Instruments). Identical penetrations were made in all experiments close to normal to the cortical surface by tilting the microdrive back at an angle of approximately 30°. For a given penetration, we recorded activity at multiple depths, typically spaced around 200 µm apart. The signal was amplified by an RA16PA Medusa preamplifier and then filtered and digitized by a RZ5 Bioamp Processor (Tucker-Davis Technologies, Alachua, FL, USA). To estimate spiking activity, we thresholded signals, filtered between 300 Hz and 4 kHz, and sampled at 24.4 kHz, on each tetrode by using the channel with the largest signal-to-noise ratio. As is true for all extracellular recordings, spiking activity might be dominated by activity from large pyramidal neurons. The spike waveforms were sorted into clusters offline, using the public domain software MClust 3.5 using the features peak, trough, energy, and time of the recorded waveforms. Clusters were designated as single units if the cluster was well separated from noise and interspike intervals were >2 ms.
At the end of a recording session, reference lesions were made at multiple depths using a constant current stimulator (WPI A360) passing 10 µA for 10 s. Animals were perfused through the heart with 0.9% NaCl, followed by ice cold 4% paraformaldehyde in 0.1 M phosphate buffer (pH 7.4). The top of the skull was removed, and a coronal cut was made through the brain at AP +4 mm in the stereotaxic frame. The brain was then removed and immersed in a mixture of 2% dimethyl sulfoxide and first 10% and later 20% glycerol in 0.1 M phosphate buffer (pH 7.4). The posterior part of the brain was then cut into 50-µm sections using a freezing microtome (Microm HM440E). The lesions were located in Nissl- or cytochrome–oxidase-stained coronal sections and were used to verify the recording locations and to calculate tissue shrinkage due to processing of the tissue. From this we assigned each recorded cell to either supragranular, granular, or infragranular layers and calculated relative cortical depths (Hawken et al. 1988). The laminar boundaries in the plots correspond to the population average top and bottom of the granular layer. Laminar locations were assigned only for neurons for which the recording location could be reliably estimated (85 of 120), and only such recording locations are plotted in the laminar analyses (Fig. 3).
Stimuli were generated with the Psychophysics Toolbox running on a Mac Mini and presented on a gamma corrected 21″ diameter (∼56.7° visual angle) Compaq Qvision 210 cathode ray tube monitor running at 119.22 Hz. Maximum luminance measured with a Minolta TV-color analyzer was determined as 50 cd/m², and “white,” “gray,” and “black” values were adjusted to have 100%, 50%, and 0% of this value, respectively. Before recording, we mapped the approximate location of the RF of the neurons under study by manually sliding bars generated with a simple graphics program back and forth on the monitor. The stimulus was then positioned in this area at eccentricities between 10° and 35°. We consider the results based on this eccentricity range to be representative for the entire visual field, since the tree shrew retina does not contain a fovea with a different composition of rods and cones.
For RF mapping, we employed the sparse-noise stimulus, consisting of a sequence of randomly positioned luminance increment (white) and luminance decrement (black) squares, 1.3° × 1.3° in size, over a gray background with a 50% overlap between adjacent positions. The effective resolution of this mapping procedure was thus 0.66°. This pixel size was chosen to ensure robust responses of the neurons while remaining small enough to reveal the RF shape even for relatively small RFs. The smallest RFs found still had a diameter of at least twice the width of the sparse-noise pixel. We generally employed a 15 × 15 position grid covering 10° × 10° of the visual field; in rare cases, we used 21 × 21 or 25 × 25 position grids with the same effective resolution of 0.66°. To ensure even sampling, we discarded data from the outer 0.66° wide rim of the mapped area. Each black or white square was presented for a total duration of about 83.3 ms, corresponding to 10 video frames at 120 Hz. There was no blank period between subsequent stimuli. The sequence of all stimuli was repeated 5 or 10 times in different pseudorandom orders.
For a subset of recording locations, we also employed the Hartley subspace stimulus (Ringach et al. 1997), which is made up of static sine-wave luminance gratings with different orientations, phases, contrasts, and spatial frequencies. We always chose the stimulus to be large enough to cover all simultaneously recorded RFs, ranging from 10° to 30°. Orientations were uniformly distributed with either 22.5° or 11.25° spacing. We typically showed 3–5 different spatial frequencies ranging from 0.07 to 0.6 cycles/°, up to 3 different contrast levels between 10% and 100% of the total luminance range, and 2 or 4 phases spaced at π or π/2. Stimuli were again presented for 10 video frames, without intervening blank periods, for 3–10 repetitions in a different random order.
For a subset of recording locations, we also used sinusoidal drifting gratings, which were presented at a fixed, manually determined optimal speed (1–3 cycles per second) and spatial frequency (0.03–0.7 cycles/°) and chosen to be large enough (10–30°) to cover all simultaneously recorded RFs. Each stimulus was shown either for 3 s with a 2-s blank period between 2 grating presentations or for 2 s with a 1-s interstimulus blank period. Spontaneous firing rates were determined from the interstimulus intervals and had average values of 2.6 ± 0.45 spikes/s. We showed 1–4 different contrast conditions (between 10% and 100%) and 8 drift directions spaced uniformly at 45° intervals—corresponding to 4 different orientations, each drifting in 2 opposite directions.
Receptive Field Mapping
RFs were estimated for each unit by weighting each square on the grid with the average firing rate evoked by its onset in a window from 20 to 100 ms after stimulus onset. RFs were calculated separately for trials with black and white stimuli. Oriented 2-dimensional Gaussian functions were fit to the resulting activation maps:
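The fitted function itself did not survive in this copy of the text. A common parameterization of an oriented 2-dimensional Gaussian is sketched here as an assumption, not as the exact expression used in the study. The symbols A (amplitude), (x0, y0) (center), and φ (tilt) are illustrative names; σx and σy denote the spreads along the two principal axes of the ellipse.

```latex
G(x, y) = A \exp\!\left( -\frac{x'^2}{2\sigma_x^2} - \frac{y'^2}{2\sigma_y^2} \right),
\qquad
\begin{aligned}
x' &= (x - x_0)\cos\varphi + (y - y_0)\sin\varphi, \\
y' &= -(x - x_0)\sin\varphi + (y - y_0)\cos\varphi .
\end{aligned}
```

Under this form, the fit returns the position (x0, y0), the tilt φ, and the two spread parameters from which an aspect ratio can be computed. Whether a constant baseline term was included in the original fit is not stated.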
The Hartley stimulus RF map was calculated exactly like the sparse-noise RF map, by summation over the visual stimuli weighted by the number of spikes fired in response to each stimulus. Gaussian functions were fitted to the resulting Hartley RF estimates in the same way as for the sparse-noise RF estimates.
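The two mapping procedures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names, the array layouts, and the use of seconds as the time unit are all hypothetical, and only the 20 to 100 ms response window is taken from the text.

```python
import numpy as np

def sparse_noise_rf(positions, onsets, spikes, grid_shape, window=(0.020, 0.100)):
    """Mean evoked firing rate (spikes/s) at each grid position.

    positions : (n_stim, 2) integer (row, col) grid indices (assumed layout)
    onsets    : (n_stim,) stimulus onset times in seconds
    spikes    : 1-D array of spike times in seconds
    """
    spikes = np.sort(np.asarray(spikes, dtype=float))
    onsets = np.asarray(onsets, dtype=float)
    # Count spikes falling in [onset + 20 ms, onset + 100 ms) for every stimulus.
    lo = np.searchsorted(spikes, onsets + window[0], side="left")
    hi = np.searchsorted(spikes, onsets + window[1], side="left")
    counts = hi - lo
    total = np.zeros(grid_shape)
    n_shown = np.zeros(grid_shape)
    rows, cols = np.asarray(positions).T
    np.add.at(total, (rows, cols), counts)   # accumulate spike counts per position
    np.add.at(n_shown, (rows, cols), 1)      # number of presentations per position
    mean_count = np.divide(total, n_shown, out=np.zeros(grid_shape), where=n_shown > 0)
    return mean_count / (window[1] - window[0])

def hartley_rf(stimuli, spike_counts):
    """Spike-weighted sum of stimulus images; stimuli has shape (n_stim, H, W)."""
    return np.tensordot(np.asarray(spike_counts, dtype=float),
                        np.asarray(stimuli, dtype=float), axes=1)
```

To obtain separate black and white maps, `sparse_noise_rf` would be called once per polarity with the corresponding subset of stimuli.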
An OI was calculated as in Schiller et al. (1976), Kagan et al. (2002), and Martinez et al. (2005) as $\mathrm{OI} = (2\sigma_w + 2\sigma_b - d)/(2\sigma_w + 2\sigma_b + d)$, where $\sigma_w$ and $\sigma_b$ are the average of σx and σy for the white and black field, respectively, 2σ the radius of the RF, and $d$ the Euclidean distance between the black and white RF centers. The OI has a value of one for perfectly overlapping subfields, decreases to zero for tangential subfields, and becomes negative for subfields separated in space.
We calculated the white/black ratio similar to Yeh, Xing, Shapley (2009) as $\log(R_w/R_b)$, the logarithmic ratio of the average firing rate for “white” ($R_w$) compared with “black” ($R_b$) stimulus trials. This ratio is zero for equal activity, and the more negative it is, the more the “black” response dominates the “white.” We define the absolute value of this ratio, $|\log(R_w/R_b)|$, as the polarity preference. Large values indicate that the unit has a robust preference for one contrast polarity over the other.
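The overlap index and polarity measures described above can be illustrated with a short sketch. It assumes the Schiller-style form OI = (2σw + 2σb − d)/(2σw + 2σb + d), which reproduces the limiting values given in the text (one for coincident subfields, zero for tangential ones, negative for separated ones). The function names are hypothetical, and the base of the logarithm is not specified in the text; base 10 is used here purely as an assumption.

```python
import math

def overlap_index(sigma_white, sigma_black, distance):
    """Overlap index from subfield spreads and center-to-center distance.

    Each subfield radius is taken as 2*sigma, so the sum of radii is
    2*sigma_white + 2*sigma_black (assumed form, see lead-in).
    """
    radius_sum = 2.0 * sigma_white + 2.0 * sigma_black
    return (radius_sum - distance) / (radius_sum + distance)

def white_black_ratio(rate_white, rate_black):
    """Log ratio of mean firing rates; negative when black responses dominate.

    Base 10 is an assumption; the source only says "logarithmic ratio".
    """
    return math.log10(rate_white / rate_black)

def polarity_preference(rate_white, rate_black):
    """Absolute log ratio: large when one polarity clearly dominates."""
    return abs(white_black_ratio(rate_white, rate_black))
```

For example, two subfields with σ = 1° each have radii of 2°, so they are tangential at a center distance of 4°, where `overlap_index(1, 1, 4)` evaluates to zero.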
Orientation tuning was calculated from the responses to the Hartley stimuli using only the preferred spatial frequency and contrast level, but taking the average across the phases.
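Preferred orientation, as well as tuning strength, was extracted from the responses using an orientation selectivity index (OSI) that relies on vector summation (Naito et al. 2007):
$$\mathrm{OSI} = \frac{\left|\sum_i R(\theta_i)\, e^{\mathrm{i}\,2\theta_i}\right|}{\sum_i R(\theta_i)}$$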
The vectors of the responses R(θi) to each orientation θi are added up in the complex plane and then normalized by the sum of all responses. The OSI will have a value between 0 for untuned and 1 for perfectly tuned responses. The angle of the average vector denotes the preferred orientation of the recording location. To determine whether a recording location was significantly tuned for stimulus orientation, we used a Monte Carlo resampling technique. We shuffled all trials so that they would be assigned to random orientations and recalculated the OSI as described above. We repeated this procedure 10 000 times with random trial permutations and took the upper 1% value as threshold for significance.
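The vector-sum index and the resampling threshold described above could be sketched as follows. This is an illustration under assumptions rather than the original analysis code: angles are doubled before summation because orientation is periodic over 180°, the preferred orientation is taken as half the angle of the resultant, and all function and parameter names are hypothetical.

```python
import numpy as np

def osi_and_preferred(orientations_deg, responses):
    """Vector-sum OSI (0 = untuned, 1 = perfectly tuned) and preferred orientation."""
    theta = np.deg2rad(np.asarray(orientations_deg, dtype=float))
    r = np.asarray(responses, dtype=float)
    # Angle doubling maps the 180-degree orientation cycle onto the full circle.
    vec = np.sum(r * np.exp(2j * theta)) / np.sum(r)
    osi = np.abs(vec)
    preferred = (np.rad2deg(np.angle(vec)) / 2.0) % 180.0
    return osi, preferred

def osi_threshold(trial_orientations_deg, trial_responses,
                  n_perm=10_000, percentile=99.0, seed=0):
    """Significance threshold from shuffling trial-to-orientation assignments."""
    rng = np.random.default_rng(seed)
    ori = np.asarray(trial_orientations_deg, dtype=float)
    resp = np.asarray(trial_responses, dtype=float)
    levels = np.unique(ori)
    null = np.empty(n_perm)
    for k in range(n_perm):
        shuffled = rng.permutation(ori)  # random reassignment of trials
        means = np.array([resp[shuffled == o].mean() for o in levels])
        null[k] = osi_and_preferred(levels, means)[0]
    # Upper 1% of the shuffled distribution with the default percentile.
    return np.percentile(null, percentile)
```

A unit would be called significantly tuned when its observed OSI exceeds the value returned by `osi_threshold`.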
The relative temporal modulation of a cell in response to a drifting sine-wave grating is usually measured using the ratio of the amplitude of the modulated component of the response (F1) to its average firing rate (F0). If a cell's modulation amplitude exceeds its average firing rate, it is classified as a simple cell (F1/F0 > 1). We calculated the F1/F0 ratio based on the responses to drifting grating stimuli at optimal orientation and contrast. We fitted a sinusoid to the response after subtracting the average spontaneous activity measured in the interstimulus intervals and binning to 50 ms bins, excluding the onset transient. The frequency of the sinusoid was fixed at the drift frequency of the grating. F1 then is the amplitude of the best fit, and F0 its offset from zero.
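The sinusoid fit described above can be sketched with linear least squares, since fitting a sinusoid of fixed frequency with free amplitude and phase is equivalent to regressing on a sine and a cosine at that frequency plus a constant. The function name and arguments are hypothetical, and removal of the onset transient is assumed to have been done by the caller.

```python
import numpy as np

def f1_f0(binned_rate, bin_width, drift_hz, spontaneous=0.0):
    """Return (F1/F0, F1, F0) from a binned response to a drifting grating.

    binned_rate : firing rate per bin (spikes/s), onset transient already removed
    bin_width   : bin width in seconds (0.05 for the 50 ms bins in the text)
    drift_hz    : temporal frequency of the grating; the fit frequency is fixed to it
    spontaneous : spontaneous rate subtracted before fitting
    """
    y = np.asarray(binned_rate, dtype=float) - spontaneous
    t = (np.arange(y.size) + 0.5) * bin_width  # bin centers
    w = 2.0 * np.pi * drift_hz
    design = np.column_stack([np.ones_like(t), np.sin(w * t), np.cos(w * t)])
    (f0, a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    f1 = np.hypot(a, b)  # amplitude of the best-fitting sinusoid
    # The ratio is only meaningful when the offset F0 is positive.
    return f1 / f0, f1, f0
```

With this convention, a unit whose returned ratio exceeds 1 would fall into the simple cell category under the F1/F0 criterion.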
Results
We present here data collected from the V1 of 16 tree shrews, where we recorded spiking activity from a total of 120 single neurons at eccentricities between 10° and 35° spread throughout the cortical layers. We generally recorded simultaneously from 2 or 3 cortical sites at similar depths. Single neuron activity was isolated using tetrode recording, as illustrated for an example unit in Figure 1A. We proceed by first describing properties of the RFs for V1 neurons, estimated using a sparse-noise stimulus (Fig. 1B, see also Materials and Methods). We then examine the selectivity of neural responses by quantifying orientation tuning estimated using drifting as well as static grating stimuli, and by assigning neurons to the simple or complex category. Finally, we relate spatial RF properties to orientation tuning and the temporal neural response modulations.
Spatial Selectivity Analysis
To quantify the RF shape, we generated pseudocolor maps based on the mean firing rate at each location on a rectangular grid where a sparse-noise pixel was shown (see Materials and Methods). In Figure 1B, a resulting RF map recorded in the supragranular layer (at ∼0.34 relative cortical depth) is shown separately for black and white stimuli. Two-dimensional, oriented Gaussians were fitted to each RF map, allowing us to compute long and short axis RF spread parameters σl and σs as well as the position and tilt of the RF. The RFs were not circular but elongated, as evidenced by σl/σs aspect ratios of around 1.6, and exhibited a near horizontal tilt. The center positions of the 2 Gaussian fits were highly similar to each other (horizontal: 15°, vertical: 7°, eccentricity: 16°), resulting in an OI (see Materials and Methods) of 0.92 between RFs for black and white stimuli. Finally, while the RF size was similar for black and white RFs, black stimuli elicited greater activity than white ones, as evidenced by a negative white/black ratio (see Materials and Methods) of −0.64.
To examine RF parameters across the population, we first estimated the mean RF size. Because 95% of the Gaussian distribution lies within ±2σ of the mean value, we took 2σ as the RF radius and estimated the RF size as the mean RF diameter (4σ) for comparability with hand-mapped RF estimates; RF size is plotted against eccentricity in Figure 2A. RFs could be well approximated by a Gaussian fit for 84, 112, and 77 of the 120 neurons for white stimuli, black stimuli, and stimuli of both polarities, respectively. RF size varied between approximately 2° and 12° of visual angle and was positively correlated with eccentricity for both black and white stimuli (rblack = 0.37 and rwhite = 0.36, P << 0.01). We found no significant difference in the RF size for black and white stimuli (paired t-test, P = 0.09). Across the population, black stimuli elicited greater activity than white stimuli, as evidenced by a negative average white/black ratio (Fig. 2B) of −0.51. Consistent with this, spiking activity was greater for black stimuli in 96 of 120 neurons, compared with only 24 of 120 white-dominant neurons. We found a large overlap between black and white stimulus RFs (Fig. 2C), with unimodally distributed data (Hartigan's dip test: P = 0.99, 10 000 bootstraps) and a mean OI of 0.82. Examining the aspect ratio, we found similar values for black and white stimuli (P = 0.95) as well as for cortical layers (P = 0.67) according to a 2-way analysis of variance (ANOVA). The average aspect ratio was 1.33. The tilt of the elongated RF in the visual space was distributed highly nonuniformly for both black and white stimuli (Rayleigh test, P << 0.001). The longer radius of the ellipse tended to lie close to horizontal, as shown in Figure 2D, with 67 of 112 (60%) and 48 of 84 (57%) of units having a tilt within ±30° of the horizontal for black and white stimuli, respectively. Note that for a uniform distribution, only 33% of tilts would lie in this range.
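The tilt statistics above can be illustrated with a short sketch. RF tilt is axial (a tilt of 0° equals a tilt of 180°), so the angles are doubled before applying the Rayleigh test, and the ±30° band around horizontal spans 60° of the 180° cycle, which gives the 33% chance level quoted in the text. The closed-form p-value used here is one common approximation and is an assumption; the text does not say how the P value was computed, and the function names are hypothetical.

```python
import numpy as np

def rayleigh_axial(tilts_deg):
    """Rayleigh test for nonuniformity of axial data (period 180 degrees).

    Returns the mean resultant length and an approximate p-value.
    """
    ang = 2.0 * np.deg2rad(np.asarray(tilts_deg, dtype=float))  # angle doubling
    n = ang.size
    r = np.abs(np.mean(np.exp(1j * ang)))
    big_r = n * r
    # Closed-form approximation to the Rayleigh p-value (assumed, see lead-in).
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n**2 - big_r**2)) - (1 + 2 * n))
    return r, min(p, 1.0)

def fraction_near_horizontal(tilts_deg, half_width=30.0):
    """Fraction of tilts within +/- half_width degrees of horizontal (0 or 180)."""
    t = np.asarray(tilts_deg, dtype=float) % 180.0
    dist = np.minimum(t, 180.0 - t)
    return np.mean(dist <= half_width)
```

Applied to the fitted tilts of one polarity, the first function tests for a directional bias and the second gives the proportion to compare against the 33% expected under uniformity.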
Orientation Selectivity Analysis
Here, we focus on orientation selectivity, assessed at 87 recording sites using briefly flashed static sinusoidal gratings of different orientations and spatial frequencies (see Materials and Methods). We found significant orientation selectivity (Monte Carlo resampling: P < 0.05, see Materials and Methods) for 35 (40%) of a total of 87 single units with a mean OSI of 0.41 ± 0.04. Computing the tuning width (TW) for this population of tuned single units, we observed an average TW value of 32.5 ± 2.3°. Recording locations were reconstructed based on electrolytic lesions and cytochrome–oxidase histochemistry (see Materials and Methods), as illustrated in Figure 3A showing a lesion in the granular layer, near the boundary of the supragranular layers. Orientation selectivity values were similar between the cortical layers: supragranular (OSIsg: 0.32), granular (OSIg: 0.29), and infragranular (OSIig: 0.21; ANOVA, P > 0.1), as shown in Figure 3B. However, highly tuned neurons (OSI >0.5) tended to occur more frequently in the supragranular (4 of 12) than in the granular (0 of 8) layer. We compared orientation selectivity measured with these flashed static gratings to that of drifting gratings for a population of 60 neurons for which both data were available. As expected, OSI values between the 2 stimuli were highly correlated (r: 0.66, P << 0.01), and the strength of orientation tuning was also similar (paired t-test: P = 0.36). Finally, we wanted to know if the preferred orientation of the units was also similar between the 2 stimulus sets. We tested whether the distribution of angular differences between the preferred orientations for the 2 stimuli differed significantly from zero. A t-test confirmed that preferred orientations were indeed similar for drifting and briefly flashed grating stimuli (P = 0.61).
Simple/Complex Cell Dichotomy
We used the subfield OI and the temporal modulation ratio (F1/F0) derived from the drifting grating stimulus to identify simple and complex cells in tree shrew V1. The F1/F0 ratio was unimodally distributed (Hartigan's dip test: P = 0.96, 10 000 bootstraps) and >1.0 for a minority (32 of 76 or 42%) of neurons, which is indicative of simple cells. The laminar distribution of the F1/F0 ratio is shown in Figure 3C for neurons with reliable depth estimates. We observed that 7 of 17 (41%) of granular layer neurons had F1/F0 >1, compared with only 5 of 19 (26%) in the supragranular layer (χ2 test: P = 0.07). However, simple cells according to the F1/F0 criterion were found in all cortical layers. The laminar distribution for the other defining criterion of simple cells, the OI, is shown in Figure 3D. Neurons with small OI values (OI <0.5), putative simple cells, were rare (7%) and also present in all cortical layers. We note that the fraction of neurons assigned to the simple cell category differs significantly depending on which criterion is used for the definition (OI: 7%, F1/F0: 42%, χ2 test: P < 0.01). Interestingly, a neuron's orientation selectivity was well predicted by its OI value (Fig. 4A, r: −0.45, P << 0.01), but not by its F1/F0 value (Fig. 4B, r: 0.14, P = 0.27) or its polarity preference (Fig. 4C, r: 0.1, P = 0.38). On the other hand, there was a correlation between the F1/F0 value and polarity preference (Fig. 4F, r: 0.28, P < 0.05), indicating that neurons with a strong preference for one stimulus polarity also exhibit strong temporal modulations to drifting gratings. Neurons with segregated subfields, and thus small OI values, were not only highly tuned for orientation but also exhibited elevated polarity preference and temporal modulations (Fig. 4D,E).
Relation Between Spatial and Orientation Selectivity
Here, we examine the relation between spatial RF parameters measured using the sparse-noise stimulus and orientation preference. Two examples illustrating this relationship are shown in Figure 5A,B. Both neurons had horizontally elongated sparse-noise RFs and are thus representative of the population in this respect (Fig. 2). One of the example neurons was well tuned for orientation (Fig. 5A), but had a preferred orientation that differed considerably from the RF tilt; whereas the second example neuron was untuned for orientation in spite of a highly elongated sparse-noise RF map (Fig. 5B). These 2 examples thus suggest that sparse-noise RF structure is a poor predictor of orientation selectivity. This was indeed the case for the entire population: There was no correlation between the aspect ratio of the mapped RFs and the orientation selectivity index for significantly orientation-tuned neurons (r: 0.13, P = 0.47). Examining the relationship between the sparse-noise RF tilt and orientation preference, shown in Figure 5C, we found that RF tilt was nonuniformly distributed for black and white stimuli (Rayleigh test, P << 0.001). Of orientation-tuned units, 22 of 33 (67%) and 12 of 22 (55%) had RF tilts within ±30° of the horizontal for black and white stimuli, respectively, as highlighted by shading in Figure 5C; a higher proportion than the 33% that would be expected for uniformly distributed data. In contrast to the RF tilt, preferred orientation was uniformly distributed (Rayleigh test, P > 0.1) and, furthermore, was also uncorrelated with sparse-noise RF tilt (circular r = 0.15, P = 0.26). A model for the generation of orientation preference from the RF structure of simple cells predicts the preferred orientation to be along the axis orthogonal to the connection line between the black and white RF subfield centers. 
We observed no correlation between this predicted preferred orientation and the actual orientation preference of the neuron measured with drifting gratings (circular r = 0.16, P = 0.43; see Supplementary Fig. 1). Taken together, RF structure estimates based on sparse-noise poorly predict orientation tuning. Similar to sparse-noise, the Hartley stimuli also form a set of basis functions that can be used to describe spatial aspects of the RF. We thus proceeded to examine whether spatial RF structure estimated using the flashed grating Hartley stimulus set (see Materials and Methods) yielded more reliable estimates of orientation preference. This was indeed the case, as demonstrated by 2 representative single neuron examples, shown in Figure 6A,B. In both cases, RF structure correlated well with orientation preference measured using drifting grating stimuli. This is illustrated by the correspondence between the orientation tuning function and the Gaussian fit to the Hartley RF map. Indeed, OSI values were correlated with aspect ratios of the Hartley RF map (r = 0.22 and P < 0.05), such that neurons with more circular Hartley RF maps were less well tuned for orientation. The tilt of the Hartley RF map served as a good predictor of orientation preference, as shown in Figure 6C, with 22 of 31 (71%) and 11 of 21 (52%) of orientation-tuned units having preferred orientations within ±30° of the black and white Hartley RF tilts, respectively, as highlighted by shading in Figure 6C. In addition, there was a significant circular correlation between preferred orientation and Hartley RF tilt (circular r = 0.41, P < 0.01), further supporting a close correspondence between orientation tuning and Hartley RF structure estimates. Notably, neurons with classical simple cell Hartley RF maps with side-by-side subfields of opposite polarity were virtually absent from the tree shrew V1, in line with the high degree of subfield overlap and black dominance described above.
Discussion
Our study is the first comprehensive report on quantitative spatial RF parameters in the tree shrew V1. Our findings related to the RF size are in general agreement with previous results in tree shrews using manual mapping procedures (Kaufmann and Somjen 1979; Chisum et al. 2003). For the 10–30° visual field eccentricity range we studied, the tree shrew RFs were about twice as large as corresponding values in macaque monkey (Gattass et al. 1981). RF shape was generally not circular but rather elliptical, with a strong bias toward the horizontal tilt. We consider it likely that these RF properties are inherited from inputs to V1. In the tree shrew lateral geniculate nucleus, RFs are also horizontally elongated (Holdefer and Norton 1995) and the tree shrew retina exhibits a notable horizontally oriented, elevated density of cone photoreceptors (Muller and Peichl 1989) as well as retinal ganglion cells (Debruyn 1983). In agreement with previous findings in tree shrews (Kretz et al. 1986; Veit et al. 2011), we observed that a majority of neurons responded more vigorously to black stimuli corresponding to local decrements of light. This black dominance is also a striking characteristic of macaque V1 (Yeh, Xing, Shapley 2009; Xing et al. 2010). A notable aspect of our results is the high degree of overlap between white and black stimulus RFs, often referred to as ON and OFF RF subfields in the literature. A large majority of neurons (94%) had overlap indices >0.5, such that sparse-noise pixels of either polarity presented in a similar area of visual space evoked spiking activity. A low incidence of neurons with a small subfield overlap has previously been reported for macaque V1 (Kagan et al. 2002, 15%; Mata and Ringach 2005, 30%), which is consistent with a corresponding low percentage of simple cells identified by spatial criteria in earlier literature (Hubel and Wiesel 1968; Schiller et al. 1976; Foster et al. 1985). 
Our subfield RF overlap results, while similar to those in macaque monkeys, are in stark contrast to results obtained in the cat, where subfield segregation is observed in 50–70% of neurons (Hubel and Wiesel 1962; Martinez et al. 2005). Based on the overlap criterion, we thus find that only 7% of V1 neurons in the tree shrew are of the simple cell variety, with complex cells making up an overwhelming majority. This finding is in apparent contrast to a study which reported a considerable number of apparent simple cells in the tree shrew V1 (Kaufmann and Somjen 1979). However, cells were classified in that study based on nonstandard criteria such as spontaneous firing rate, orientation TW, and RF size, so we consider it likely that many putative simple cells may in fact belong to the complex cell category.
In addition to the subfield overlap, the F1/F0 modulation ratio is also commonly used to distinguish simple and complex cells, with simple cells exhibiting values >1.0, consistent with robust entrainment of the neural response to the phase of the drifting grating stimulus. Using this criterion, 42% of tree shrew V1 neurons would fall into the simple cell category, a value that is similar to results in several other species where F1/F0 ratios have also been used for this classification, including monkey (Chen et al. 2009), cat (Dean and Tolhurst 1983), wallaby (Ibbotson et al. 2005), rat (Girman et al. 1999), and mouse (Niell and Stryker 2008). While, for example, in the cat, overlap and F1/F0 criteria yield a similar number of simple cells and select roughly the same population of neurons (Skottun et al. 1991), this is evidently not the case in tree shrews or monkeys (Kagan et al. 2002), where there are fewer cells with low subfield overlap than with high temporal firing rate modulations. We suggest that the source of this discrepancy may be the robust stimulus polarity preference present in V1 of these species (Yeh, Xing, Shapley 2009; Xing et al. 2010; Veit et al. 2011). For example, a unit with black preference will be more strongly activated when the dark segment of the grating passes over the RF compared with the bright segment. This relative difference in response strength will induce sinusoidal firing rate modulations, contributing to the F1/F0 ratio. Thus, temporal response modulations do not require spatially separate ON and OFF subfields; they can also arise at least partly from stimulus polarity preference (Martinez et al. 2005), as is evidently the case in the tree shrew. Because both black and white RFs are excitatory, this mechanism alone can account for F1/F0 ratios up to a value of 1.0.
Since many neurons exhibit higher values, we suggest that incomplete subfield overlap and polarity preference, as well as nonlinear interactions between responses to white and black stimuli and influences from the RF surround, all potentially contribute to the F1/F0 modulation ratio. This hypothesis is supported by the significant correlation between F1/F0 modulation and both the OI and polarity preference. We conclude that, particularly for species with strong black dominance such as monkey or tree shrew, the subfield overlap represents a more reliable criterion than the F1/F0 modulation for identifying simple cells. Indeed, F1/F0 modulation is known to depend strongly on various parameters, including overall firing rate (Hietanen et al. 2013) and stimulus contrast (Crowder et al. 2007). Consistent with this is our observation that subfield overlap, but not F1/F0 modulation or polarity preference, was correlated with orientation selectivity in the tree shrew. Notably, our data show that simple cells, in addition to being rare in the tree shrew, were also not confined to the thalamo-cortical input layers IV and VI, as is the case, for example, in the cat (Martinez et al. 2005; Hirsch and Martinez 2006). In this respect, the tree shrew is more similar to the monkey, where simple cells are also found in all layers, although macaque V1 displays a higher density of simple cells in thalamo-cortical input layers (Kagan et al. 2002; Ringach et al. 2002; Yeh, Xing, Williams, et al. 2009).
Another apparent similarity between tree shrew and macaque V1 is that orientation selectivity tends to be elevated in supragranular layers in both species (Ringach et al. 2002; Chisum et al. 2003; Gur et al. 2005; Bhattacharyya et al. 2012), although this effect is present only as a trend in the present dataset. In contrast, well orientation-tuned neurons are already present in the granular layer in cat and mouse (Hirsch and Martinez 2006; Niell and Stryker 2008). A pertinent question is to what degree orientation selectivity can be predicted from the RF structure. This seems to work well in the cat, where there are many cells with spatially segregated subfields (Martinez et al. 2005), and preferred orientation can be well predicted from subfield structure (Ferster et al. 1996; Lampl et al. 2001). In the tree shrew, we find a large overlap between subfields, which are elongated in visual space for stimuli of both polarities. We therefore asked whether orientation selectivity could be predicted from RF elongation in the tree shrew and found that this was clearly not the case: Neither the preferred orientation nor the magnitude of orientation selectivity was related to the RF tilt or aspect ratio. An important consideration is that we used sparse-noise for our RF estimation, which, due to its locally restricted nature in visual space, reveals mostly thalamic inputs rather than recurrent cortical contributions to V1 activity from the activation of nearby cortical columns. This suggests that orientation selectivity is not computed purely on a feed-forward basis, which is consistent with previous reports that have described a crucial role of cortical recurrent connections in generating orientation selectivity in tree shrews (Fitzpatrick 1996; Chisum et al. 2003; Chisum and Fitzpatrick 2004; Mooser et al. 2004).
These studies have found good agreement between orientation selectivity and RF elongation, in apparent contrast to our present results. However, an important technical aspect is that they employed oriented bar stimuli for RF estimation, which cover a considerably larger part of visual space and thus can be expected to activate more recurrent lateral inputs than sparse-noise. We therefore estimated RF structure for our neural population using the Hartley subspace stimulus, which covers about 10 times more of the visual field than an individual sparse-noise dot. We show that both orientation preference and selectivity could be well predicted from the Hartley RF tilt and aspect ratio, in agreement with previous findings in tree shrews and monkeys (Chisum et al. 2003; Yeh, Xing, Williams, et al. 2009). This result is also generally consistent with other studies suggesting that RF structure can differ substantially depending on the mapping stimuli employed (Ringach et al. 2002; Niell and Stryker 2008; Yeh, Xing, Williams, et al. 2009; Fournier et al. 2011), highlighting the important role of cortical recurrent activity in generating structural and functional aspects of V1 activity.
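As an illustration of how RF tilt and aspect ratio can be read out from a mapped RF and compared with a measured preferred orientation, the sketch below uses intensity-weighted second moments of a 2D map. This is a stand-in technique chosen for brevity; the fitting procedure actually used in this study is not restated in this section. The function names, the synthetic Gaussian RF, the convention that the row index increases upward, and the example preferred orientation of 35 degrees are all hypothetical.

```python
import math


def rf_tilt_and_aspect(rf_map):
    """Tilt (degrees, 0-180) and aspect ratio of a 2D RF map via second moments.

    rf_map: list of rows of non-negative response strengths; column index is x
    and row index is y (assumed to increase upward).
    """
    total = sum(sum(row) for row in rf_map)
    mx = sum(v * x for row in rf_map for x, v in enumerate(row)) / total
    my = sum(v * y for y, row in enumerate(rf_map) for v in row) / total
    cxx = cyy = cxy = 0.0
    for y, row in enumerate(rf_map):
        for x, v in enumerate(row):
            cxx += v * (x - mx) ** 2
            cyy += v * (y - my) ** 2
            cxy += v * (x - mx) * (y - my)
    cxx, cyy, cxy = cxx / total, cyy / total, cxy / total
    # Orientation of the major axis of the weighted covariance ellipse.
    tilt = math.degrees(0.5 * math.atan2(2 * cxy, cxx - cyy)) % 180.0
    # Eigenvalues of the 2x2 covariance matrix are the variances along each axis.
    spread = math.hypot(cxx - cyy, 2 * cxy)
    major = (cxx + cyy + spread) / 2
    minor = (cxx + cyy - spread) / 2
    return tilt, math.sqrt(major / minor)


def orientation_difference(a_deg, b_deg):
    """Smallest angle between two orientations (180-degree periodic)."""
    diff = abs(a_deg - b_deg) % 180.0
    return min(diff, 180.0 - diff)


def gaussian_rf(size, tilt_deg, sigma_long, sigma_short):
    """Synthetic elongated Gaussian RF, used here only as test input."""
    c = (size - 1) / 2
    t = math.radians(tilt_deg)
    rf = []
    for y in range(size):
        row = []
        for x in range(size):
            u = (x - c) * math.cos(t) + (y - c) * math.sin(t)   # along long axis
            v = -(x - c) * math.sin(t) + (y - c) * math.cos(t)  # along short axis
            row.append(math.exp(-0.5 * ((u / sigma_long) ** 2
                                        + (v / sigma_short) ** 2)))
        rf.append(row)
    return rf


rf = gaussian_rf(size=41, tilt_deg=30.0, sigma_long=4.0, sigma_short=2.0)
tilt, aspect = rf_tilt_and_aspect(rf)
measured_preferred = 35.0  # hypothetical grating-derived preferred orientation
print(round(tilt, 1), round(aspect, 2),
      round(orientation_difference(tilt, measured_preferred), 1))
```

For the synthetic RF, the recovered tilt and aspect ratio should be close to the values used to build it (30 degrees and 2). In practice a map with negligible spread along the minor axis would need a guard against division by zero, which is omitted here.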
Taken together, our findings reveal a number of striking similarities between tree shrew and macaque V1, related to simple cell classification and the mechanism for generating orientation selectivity. Our observation that, in the tree shrew, the cortical input layer is already dominated by complex cells, together with the overall very low prevalence of simple cells, suggests that simple cells may not represent an obligatory step in visual information processing. This is consistent with previous suggestions that primate V1 in fact contains 2 classes of complex cells, one of which is not dependent on pooling responses from simple cells (Kagan et al. 2002). This has implications for recent theoretical work, which posits that ON/OFF-segregated RF subfields are at the origin of orientation tuning maps (Paik and Ringach 2011, 2012). While this model may hold in species such as cat or ferret, our present data suggest that it cannot explain orientation domains in the tree shrew, where RF subfield segregation is rare and rather weak. Our findings support the notion that there may be 2 separate mechanisms for generating orientation-tuned, complex-type neural responses. In carnivores, a substantially feed-forward mechanism is at work, while tree shrews and primates utilize an additional mechanism relying heavily on intracortical processing.
This work was supported by an SNF Prodoc grant PDFMP3_127179 and an ESF EURYI grant PE0033-117106 to G.R.
We thank D. Leopold and F. Sommer for helpful comments on the manuscript. Conflict of Interest: None declared.