Functional magnetic resonance imaging (fMRI) was used to estimate the average receptive field sizes of neurons in each of several striate and extrastriate visual areas of the human cerebral cortex. The boundaries of the visual areas were determined by retinotopic mapping procedures and were visualized on flattened representations of the occipital cortex. Estimates of receptive field size were derived from the temporal duration of the functional activation at each cortical location as a visual stimulus passed through the receptive fields represented at that location. Receptive fields are smallest in the primary visual cortex (V1). They are larger in V2, larger again in V3/VP and largest of all in areas V3A and V4. In all these areas, receptive fields increase in size with increasing stimulus eccentricity. The results are qualitatively in line with those obtained by others in macaque monkeys using neurophysiological methods.
A fundamental organizational principle of the primate visual system is that the visual field is represented several times in occipital cortex, once in the primary visual cortex (V1) and once in each of several other visual areas, including V2, V3, V3A and V4 (Hubel and Wiesel, 1977; Zeki, 1978b; Hubel and Livingstone, 1987; Maunsell and Newsome, 1987; Lennie, 1998). These separate visual representations are topographically organized and are broadly arranged in a hierarchy, albeit with strong feedback as well as feedforward connections among them. Other more anterior visual and polysensory areas receive afferents from V1–V4 and these too have independent representations of visual space, although some selectively represent sub-regions of the field and topographic organization is sometimes imprecise or absent. A similar hierarchical arrangement of visual areas has been observed in various non-primate mammal species, such as the cat, and can be assumed to be a pervasive feature of the mammalian cerebral cortex. Recently, the same organization has been demonstrated in the human occipital cortex using functional magnetic resonance imaging (fMRI) (Sereno et al., 1995; Tootell et al., 1995; Engel et al., 1997).
A large literature documents the response properties of single neurons within the ‘early’ primate visual areas, V1–V4. In all these areas, receptive fields are relatively small in the foveal representation and increase in size with stimulus eccentricity. At any given eccentricity, receptive fields are smallest in V1 and become progressively larger through the hierarchy of visual areas (Zeki, 1978b). V1 receptive fields are commonly <1 deg² in area in or near the fovea, while neurons in the peripheral field representation of some extrastriate visual areas may have receptive fields >100 deg². Neurons in any given cortical vicinity typically have receptive fields centred at about the same location in the visual field, but receptive field scatter increases along with receptive field size, both within and across visual areas (Hubel and Wiesel, 1974).
We have developed a novel method of estimating the average receptive field size of neurons at any desired location within the human visual areas, using fMRI techniques. Although the spatial resolution of fMRI is very far from sufficient to resolve the activity of individual neurons, we can estimate the average receptive field size of the neurons in one functional voxel (in our case 16 mm³) and study how the average size changes with visual field location and from one visual area to another. Because the early visual areas are retinotopically organized, we can assume that, as in other primates, nearby neurons in human visual cortex tend to have similar receptive field sizes. This means that local average measures of size will reflect a relatively small variance among the contributing neurons. The technique we have developed is derived from that used for standard retinotopic mapping in the human visual cortex (Sereno et al., 1995; DeYoe et al., 1996; Engel et al., 1997).
Materials and Methods
The subjects were five healthy male human adults: three of the authors (A.T.S., K.D.S., A.L.W.) and two volunteers who were paid for their time. Subjects were screened in accordance with standard procedures and informed consent was obtained in writing.
Visual stimuli were generated by a Macintosh computer and were projected onto a rear-projection screen covering one end of the bore of the scanner, using an LCD projector (resolution 800 × 600 at 75 Hz). The subject lay on his/her back in the scanner, looking upwards at a mirror in which an image of the projection screen was reflected. The screen was at the end nearest to the subject's head and so the field of view was not restricted by the body. This arrangement gave an image which was approximately circular and had a diameter of 30° (maximum) at the viewing distance of 1.2 m. The mean luminance of the image was 35 cd/m². Stimulus presentation was synchronized to image acquisition by means of a pulse generated by the computer controlling the scanner.
The stimuli were based on those originally employed for retinotopic mapping by Engel et al. (Engel et al., 1994) and Sereno et al. (Sereno et al., 1995) and were identical to those used in our previous work (Smith et al., 1998, 2000). A high-contrast radial checkerboard pattern whose contrast reversed at a frequency of 8 Hz was used. Check size was scaled with eccentricity to produce maximal activation of the visual areas. Two variants were used (see Fig. 1), one for locating boundaries between areas and one for estimating eccentricity. The former (Fig. 1a) was an 80° sector, or wedge, which rotated slowly about the fixation spot, either clockwise or counter-clockwise. It rotated in steps of 20° (18 steps in a complete rotation). It remained in each position for 3 s (the time taken to acquire one set of functional data; see below) before instantaneously rotating to the next position. Each point in the circular image was thus stimulated continuously for 12 s and this occurred once every 54 s, the time taken for one revolution. [Sereno et al. (Sereno et al., 1995) used slow, continuous motion. Our method yields equivalent results but obviates the need to compensate for the different image positions during acquisition of the different functional slices.] In the case of eccentricity mapping, the stimulus was a ring (Fig. 1b) which either expanded or contracted. When it reached the edge of the stimulus (in the expanding case) or the centre (contracting) it ‘wrapped round’ and reappeared at the centre (edge), so that several cycles of expansion (contraction) could be shown. Like the wedge, the ring moved in discrete steps, once every 3 s. The width of the ring and step size were chosen such that each point in the image was again stimulated for 12 s every 54 s.
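The timing figures quoted above follow directly from the stimulus extent and step size. The short Python sketch below reproduces that arithmetic. The function and variable names are illustrative only and are not taken from the authors' software; the ring width (2.7°) and the 18 steps per 12° cycle are the values reported later in the text.

```python
def stimulus_timing(extent, step, full_range, dwell_s=3.0):
    """Return (steps_per_cycle, on_time_s, cycle_time_s) for a stepped stimulus.

    extent     -- stimulus width along its direction of travel
    step       -- displacement per step (same units as extent)
    full_range -- distance covered in one complete cycle (same units)
    dwell_s    -- time spent at each position (one volume acquisition)
    """
    steps_per_cycle = round(full_range / step)
    # Number of consecutive positions in which a given point is covered.
    steps_covering_point = round(extent / step)
    return (steps_per_cycle,
            steps_covering_point * dwell_s,
            steps_per_cycle * dwell_s)

# Wedge: 80 deg sector, 20 deg steps, 360 deg per rotation.
wedge = stimulus_timing(extent=80, step=20, full_range=360)
# Ring: 2.7 deg wide, 12 deg maximum radius, 18 steps per cycle.
ring = stimulus_timing(extent=2.7, step=12 / 18, full_range=12)

print(wedge)  # (18, 12.0, 54.0)
print(ring)   # (18, 12.0, 54.0)
```

Both stimuli therefore share the same on-time (12 s) and cycle time (54 s), which is what allows the same analysis to be applied to each.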
The same stimuli were used for mapping the boundaries of the various retinotopically organized visual areas of the occipital cortex and for estimating receptive field sizes within them. The procedure for estimating size (see Data Analysis) involves re-analysing the data acquired for the purposes of retinotopic mapping and so one experiment served both purposes.
Imaging was performed with a 1.5 T whole-body Siemens Magnetom (Vision) scanner equipped with a gradient system having 25 mT/m amplitude and 0.3 ms rise-time. The subject was positioned with his/her head in an RF receive–transmit full head coil. Head motion was minimized with a vacuum cap, which was secured within the head coil. Local variations in blood oxygenation (BOLD response) were measured using susceptibility-based functional magnetic resonance imaging, applying gradient-recalled echo-planar imaging (EPI) sequences.
Between 12 and 16 parallel, 4 mm thick planes, positioned in posterior cortex, were imaged every 3 s using a T2*-weighted sequence (TR = 3000 ms, TE = 84 ms, flip angle = 90°, field of view = 256 × 256 mm², 128 × 128 voxels). The positions of the planes were between axial and coronal, approximately parallel to the calcarine sulcus, and were chosen with the aid of a mid-sagittal T1-weighted scout image so as to include the entire occipital lobe.
Each experimental run lasted for 216 s, during which time the acquisition volume was imaged repeatedly (72 volume acquisitions, 3 s each). In each run, four complete rotations, expansions or contractions of the flickering checkerboard were presented. Eight such runs were conducted, four using wedges and four using rings. In the wedge case, the rotation was clockwise in two runs and counter-clockwise in two. In the ring case, the ring expanded in two and contracted in two.
For each subject, sagittal T1-weighted 3D-MP-Rage images (magnetization-prepared, rapid-acquisition gradient echo; Siemens GmbH, Erlangen, Germany) of the entire brain were acquired (voxel size 1 × 1 × 1 mm³). The anatomical data were used to determine the anatomical localization of functional responses. Such localization was performed using cortical flattening algorithms to obtain two-dimensional (2-D) representations of cortical grey matter (Sereno et al., 1995; Engel et al., 1997).
The data were analysed and visualized using our own in-house software, ‘BrainTools’ (http://www.aston.ac.uk/psychology/meg/mri3dx), with two exceptions (motion-correction and cortical flattening) which are identified below. Each functional volume was first processed using a 2-D motion correction program, ‘imreg’, part of the AFNI package (Cox, 1996). This re-aligns each image in the time series to the average image position. Prior to analysis, spatial smoothing of the functional signal within each slice was performed by convolution with a 2-D Gaussian function (Friston et al., 1995) of SD 1.7 mm. This smoothing reduces spatial noise and, because of the inherent spread of the BOLD effect, the cost in terms of spatial resolution is minimal.
Cortical Flattening and Retinotopic Mapping
Our procedure for cortical flattening and retinotopic mapping is described fully elsewhere (Smith et al., 1998) and is summarized here. A 2-D representation of occipital cortex was derived from the 3-D whole-brain anatomical data set, using an algorithm developed by Teo et al. (Teo et al., 1997). The algorithm simulates a process of flattening a portion of the grey matter (typically centred in the calcarine sulcus) into a 2-D surface. Having obtained a flattened representation of the occipital cortex of each hemisphere in each subject, the boundaries of the retinotopic visual areas were mapped onto it using a procedure based on that of Sereno et al. (Sereno et al., 1995). The temporal phase of the cyclical response produced by the rotating wedge stimulus was established (by finding the phase of the fundamental Fourier component) for each voxel in the acquisition volume. For each voxel, the phase obtained with clockwise rotation of the wedge stimulus was averaged with that obtained with counter-clockwise rotation. This cancels out the phase shift that results from the delay of several seconds between visual stimulation and the peak of the BOLD response, since these phase shifts are in opposite directions for the two rotation directions. The averaged phase angle was then represented as a pseudo-colour overlay on the flattened cortical surface. Boundaries between visual areas appear in such an overlay as a reversal of the direction of change of phase angle. Similarly, the temporal phases of the responses elicited by expanding and contracting rings were established and averaged, then plotted as a pseudo-colour overlay on the flattened cortical surface. Iso-eccentricity lines could then be drawn through regions of constant colour.
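The phase-averaging step can be illustrated with a minimal Python sketch. It is not the BrainTools implementation: the function names, the simulated sinusoidal responses and the assumption that the haemodynamic delay is shorter than half a stimulus cycle are all ours, introduced only to show why combining the two rotation directions removes the delay.

```python
import cmath
import math

TWO_PI = 2 * math.pi

def fundamental_phase(series, n_cycles):
    """Phase lag (radians, 0..2*pi) of the Fourier component at n_cycles per run."""
    n = len(series)
    mean = sum(series) / n
    coeff = sum((x - mean) * cmath.exp(-1j * TWO_PI * n_cycles * t / n)
                for t, x in enumerate(series))
    return (-cmath.phase(coeff)) % TWO_PI

def combine_directions(phase_cw, phase_ccw):
    """Separate stimulus-position phase from response delay.

    Clockwise runs give (position + delay); counter-clockwise runs give
    (2*pi - position + delay). Assumes the delay is under half a cycle.
    """
    delay = ((phase_cw + phase_ccw) % TWO_PI) / 2
    position = (phase_cw - delay) % TWO_PI
    return position, delay

def simulate(phase, n=72, n_cycles=4):
    """Hypothetical noise-free voxel response: a sinusoid with the given lag."""
    return [100 + math.cos(TWO_PI * n_cycles * t / n - phase) for t in range(n)]

true_position = 2.0               # polar-angle phase of the voxel (radians)
true_delay = TWO_PI * 6 / 54      # a 6 s delay within a 54 s cycle
cw = fundamental_phase(simulate(true_position + true_delay), 4)
ccw = fundamental_phase(simulate(TWO_PI - true_position + true_delay), 4)
position, delay = combine_directions(cw, ccw)
print(round(position, 3), round(delay, 3))  # 2.0 0.698
```

The same combination applies to expanding and contracting rings, with eccentricity phase in place of polar-angle phase.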
Estimation of Average Receptive Field Size
In order to estimate receptive field sizes at different cortical locations, the functional data used for retinotopic mapping were subjected to a quite different analysis, focusing not on the phase of the response profile but on its shape. The principle behind the analysis is illustrated in Figure 2. It has already been used by Tootell et al. (Tootell et al., 1997) to show that average receptive field sizes are larger in human V3A than V1.
Figure 2a shows diagrammatically a single expanding ring stimulus at two points in time. Also shown is the receptive field (RF) of a hypothetical neuron (shown as a small hatched circle). Consider the activity of this neuron as the ring stimulus expands. Assuming the neuron is responsive to the flickering checkerboard stimulus (which was chosen in the hope of driving as many neurons as possible) and is responsive at all locations within the RF, the neuron will start to respond as the leading edge of the ring enters the receptive field (Time 1) and will continue to respond until the trailing edge of the ring leaves the RF (Time 2). Figure 2b is similar, but the RF shown is larger. The diagrams show that the total time during which the neuron is active is greater for the large RF than the small. For an infinitely small RF, the activation period will be equal to 22% of the time taken for one complete expansion cycle of the stimulus, because the width of the ring (2.7°) was 22% of the maximum stimulus radius (12°). The active period as a percentage of total time will increase above 22% as RF size increases. It should be noted that estimates of ‘receptive field size’ based on this principle are not expected to conform perfectly to those derived from single-unit recordings, for a variety of reasons. The most important differences are considered in the Discussion.
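The geometric argument above can be written as a simple relation between RF width and duty cycle. The Python sketch below is an idealization under stated assumptions: a single RF, a ring that travels a distance equal to its own width plus the RF width while the neuron responds, and no scatter, voxel-size or edge effects. The function names are ours, and the inverse mapping is shown only to make the principle concrete, not as a validated conversion.

```python
RING_WIDTH_DEG = 2.7    # ring width reported in the text
MAX_RADIUS_DEG = 12.0   # maximum stimulus radius reported in the text

def ideal_duty_cycle(rf_width_deg):
    """Fraction of one expansion cycle during which an idealized RF is stimulated."""
    return (RING_WIDTH_DEG + rf_width_deg) / MAX_RADIUS_DEG

def ideal_rf_width(duty_cycle):
    """Inverse of ideal_duty_cycle (idealized; ignores scatter and voxel size)."""
    return duty_cycle * MAX_RADIUS_DEG - RING_WIDTH_DEG

print(f"{ideal_duty_cycle(0.0):.3f}")  # 0.225, i.e. the ~22% floor
print(f"{ideal_duty_cycle(2.0):.3f}")  # 0.392
print(f"{ideal_rf_width(0.35):.2f}")   # 1.50
```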
In our experiments, the ring executed four complete cycles of expansion (or contraction) in one acquisition series. When the resulting activation is plotted as a function of time, we expect to see four responsive phases interspersed with periods of inactivity, giving (ignoring noise) a rectangular waveform with four cycles. In principle, the duty cycle of this rectangular response profile (the ‘on-time’ as a proportion of total time) in a given voxel gives an estimate of the average receptive field size of the active neurons in the voxel. Since activation was sampled every 3 s and the ring moved 0.67° every 3 s, the resolution of our technique is limited to 0.67° of visual angle.
The same principle applies to a rotating flickering checkerboard wedge (Fig. 2c). It does not matter in which direction the stimulus moves through a given RF, so in fact any of a range of stimuli could be used. But a wedge has a special property which complicates the analysis. When a wedge is used, the ‘on-time’ is a function of eccentricity, as well as RF size. This is because, although the wedge has the same extent at all eccentricities in terms of polar angle, its width in terms of visual angle varies with eccentricity. A given duty cycle therefore corresponds to different RF sizes at different eccentricities (see Fig. 2c). This is not true for expanding/contracting rings, in which duty cycle is independent of eccentricity because the ring has the same width at all locations. Consequently, we based our analysis of receptive field size primarily on data obtained with rings.
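The eccentricity dependence of the wedge can be made concrete with a short Python sketch. It assumes the wedge width is measured as arc length and uses a small-angle approximation for the polar angle subtended by an RF; both simplifications, and the function names, are ours.

```python
import math

WEDGE_ANGLE_DEG = 80.0  # wedge extent in polar angle, from the text

def wedge_width_deg(eccentricity_deg):
    """Arc length of the wedge (degrees of visual angle) at a given eccentricity."""
    return eccentricity_deg * math.radians(WEDGE_ANGLE_DEG)

def wedge_duty_cycle(rf_width_deg, eccentricity_deg):
    """Idealized duty cycle for a rotating wedge passing over a single RF."""
    rf_polar_deg = math.degrees(rf_width_deg / eccentricity_deg)
    return (WEDGE_ANGLE_DEG + rf_polar_deg) / 360.0

for ecc in (1.0, 4.0, 8.0):
    print(f"{ecc:.0f} deg: wedge width {wedge_width_deg(ecc):.2f} deg")
# 1 deg: wedge width 1.40 deg
# 4 deg: wedge width 5.59 deg
# 8 deg: wedge width 11.17 deg

# The same 1 deg RF yields different duty cycles at different eccentricities.
print(f"{wedge_duty_cycle(1.0, 2.0):.3f}")  # 0.302
print(f"{wedge_duty_cycle(1.0, 8.0):.3f}")  # 0.242
```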
For each voxel in each experimental run, duty cycle was estimated by fitting a curve to the temporal response profile. The curve was a rectangular wave which was smoothed by convolution with a Gaussian (standard deviation = 3 s) to simulate the effect of the long time constant of the BOLD response. The best-fitting smoothed rectangular wave was found using a least-squares criterion and the duty cycle of this waveform was used as the basis of an estimate of receptive field size. Data from expanding and contracting rings were analysed separately and the resulting duty cycles were averaged.
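One way such a fit could be carried out is sketched below in Python. The text specifies only the model (a rectangular wave smoothed with a Gaussian of SD 3 s) and the least-squares criterion; the exhaustive grid search over on-time and onset, the circular smoothing, the kernel truncation at ±4 samples and the linear fit of gain and baseline are our assumptions, and the synthetic time course is purely illustrative.

```python
import math

TR_S = 3.0                    # one volume every 3 s
N_SAMPLES = 72                # volumes per run
PERIOD = 18                   # samples per stimulus cycle (54 s / 3 s)
SIGMA_SAMPLES = 3.0 / TR_S    # Gaussian SD of 3 s, expressed in samples

def gaussian_kernel(sigma, half_width=4):
    """Normalized, truncated Gaussian smoothing kernel."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smoothed_rect(on_samples, onset, kernel):
    """Periodic rectangular wave (1 while 'on', else 0), circularly smoothed."""
    rect = [1.0 if (t - onset) % PERIOD < on_samples else 0.0
            for t in range(N_SAMPLES)]
    half = len(kernel) // 2
    return [sum(kernel[j] * rect[(t + j - half) % N_SAMPLES]
                for j in range(len(kernel)))
            for t in range(N_SAMPLES)]

def fit_duty_cycle(series):
    """Grid search for the least-squares (duty_cycle, onset) of the model."""
    kernel = gaussian_kernel(SIGMA_SAMPLES)
    n = len(series)
    mean_y = sum(series) / n
    best = None
    for on_samples in range(1, PERIOD):
        for onset in range(PERIOD):
            model = smoothed_rect(on_samples, onset, kernel)
            mean_m = sum(model) / n
            s_mm = sum((m - mean_m) ** 2 for m in model)
            s_my = sum((m - mean_m) * (y - mean_y)
                       for m, y in zip(model, series))
            gain = s_my / s_mm          # best-fitting amplitude
            if gain <= 0:               # require a positive response
                continue
            sse = sum((y - mean_y - gain * (m - mean_m)) ** 2
                      for m, y in zip(model, series))
            if best is None or sse < best[0]:
                best = (sse, on_samples / PERIOD, onset)
    return best[1], best[2]

# Synthetic voxel: on for 6 of every 18 samples, plus a small perturbation.
clean = smoothed_rect(6, 5, gaussian_kernel(SIGMA_SAMPLES))
series = [100.0 + 2.0 * m + 0.05 * math.sin(1.3 * t)
          for t, m in enumerate(clean)]
duty, onset = fit_duty_cycle(series)
print(f"duty cycle = {duty:.3f}, onset sample = {onset}")
```

Because the on-time is searched in whole samples, this sketch resolves duty cycle in steps of 1/18 of a cycle; a continuous optimizer could be substituted if finer resolution were wanted.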
Sample activation timecourses are shown in Figure 3 for two different voxels reflecting two eccentricities. The two plots are from the same subject and were acquired simultaneously, in response to the same stimulus (an expanding ring). Figure 3a shows the behaviour of a voxel at a V1 location representing an eccentricity of ~2–3°. The observed response is shown by the solid line. Over the total time period (216 s) the stimulus runs through four expansion cycles and therefore activates neurons in the voxel four times. The dashed line shows the best-fitting smoothed rectangular wave. Figure 3b shows the same thing for another voxel in V1, this time for a segment of cortex corresponding to an eccentricity of ~8–10°. The on-times, or duty cycles, are clearly different in the two cases. The same visual stimulus yields a longer-lasting response (greater duty cycle) at the greater eccentricity. We interpret this difference as evidence for larger receptive fields at the greater eccentricity.
Figure 4 shows typical results in terms of colour overlays on cortical flatmaps. The top row (Fig. 4a–c) shows the same flatmap, representing the occipital cortex of one hemisphere in one subject, with three different coloured overlays. The centre of the flatmap corresponds to a position midway along the fundus of the calcarine sulcus. The foveal representation is on the left of the flatmap, marked with an ‘F’ in Figure 4a. In the first version (Fig. 4a), the temporal phase of the average response to a rotating wedge is shown, to give a standard retinotopic map. The key shows the relationship between colour and visual field position (all in the left hemifield since the cortex shown is from the right hemisphere). The boundaries of the visual areas are marked and are in accord with previous studies (Sereno et al., 1995). Area V1 represents the entire hemifield, its borders representing the upper vertical meridian ventrally and the lower vertical meridian dorsally. Adjacent to V1 lie V2d and V2v, representing the lower and upper quadrants, respectively. These two areas are believed to be functionally equivalent, differing only in field locations represented. Beyond V2 lie V3 dorsally and VP ventrally (sometimes also called V3d and V3v, respectively). These areas also represent one quadrant each and may be functionally equivalent, but in monkeys there is some evidence for important differences in both receptive field properties and cortical connectivity between the two (Burkhalter and van Essen, 1986; Burkhalter et al., 1986; Felleman and van Essen, 1987; Felleman et al., 1997). Beyond V3/VP lie V3A dorsally and V4v ventrally. In primates, there is very clear evidence for differences in RF properties between these two areas (Zeki, 1978a; Gaska et al., 1987), suggesting that they have different functions. As can be seen from the coloured overlays, V3A represents the entire hemifield.
In Figure 4b, the temporal phase of the average response to an expanding/contracting ring is shown as colours. Eccentricity increases on a scale running from black and brown to orange and yellow (see key). Some lines of approximate iso-eccentricity are marked. Measured eccentricity increases with distance from the fovea, as expected. Also shown are the boundaries of the visual areas, from Figure 4a.
In Figure 4c, the colours are derived from the same data set (expanding/contracting ring stimulus) as Figure 4b, but now the colours represent not the phase of the response but the duty cycle of the best-fitting rectangular wave (see Materials and Methods) at each location. The lowest duty cycles (implying the smallest RFs) are indicated by black and dark green and the largest by light green and yellow (see key). The boundaries of the visual areas and the iso-eccentricity lines from the phase maps are superimposed, to provide reference points. As with phase maps, there is considerable imprecision in the data, due in part to noise in the functional data and partly to distortions in the flatmap and errors in mapping the functional data onto it. But it can be seen that the lowest duty cycles are located near the fovea and duty cycle tends to increase with distance from the fovea. This is true for each of the visual areas shown. It is also apparent that, in any given eccentricity band, duty cycles are lowest in V1. As well as a trend for increasing duty cycle with distance from the fovea, there is a similar trend in the orthogonal dimension, duty cycles increasing with distance from V1.
Figures 4d and 4e show flatmaps for two other subjects with duty cycle overlays of the same type as Figure 4c. Again, the boundaries between visual areas and some iso-eccentricity lines are shown. These were derived from maps of temporal phase of the type shown in Figures 4a and 4b. The estimated location of the fovea is marked with an ‘F’ in each case. In both subjects the trends are similar to those described for Figure 4c. We have obtained similar results in all hemispheres we have studied.
The final panel (Fig. 4f) shows, for the same hemisphere shown in Fig. 4e, duty cycle data obtained using a rotating wedge in place of the ring stimulus. In this case, the principle is the same, but a given duty cycle reflects RF size scaled by eccentricity (see Fig. 2c). There will be one eccentricity at which wedges and rings are expected to give the same result. We calculate that this is 1.6° and, indeed, the colours are similar just outside the fovea in Figure 4e and 4f. If the increase in RF size with eccentricity is such that RFs have a constant subtense in terms of polar angle, then duty cycle should be invariant with eccentricity when a wedge is used (but should still vary across visual areas). In fact, a modest increase in duty cycle with eccentricity is still evident in Figure 4f, suggesting that (if anything) RFs increase in size with eccentricity more than predicted by a simple scaling in proportion to the local width of the wedge.
Figure 5 allows the results shown in the first subject of Figure 4a–c to be seen superimposed on the 3-D brain rather than a flatmap. A 3-D rendered view of the brain of this subject is shown, cut away to reveal the medial surface of one hemisphere. On the left (Fig. 5a) the coloured overlay reflects temporal phases derived from responses to a rotating wedge and shows the lower and upper visual field quadrants above and below the calcarine sulcus respectively. In the centre (Fig. 5b) the overlay shows temporal phases derived from responses to expanding/contracting rings and shows eccentricity increasing with distance from the occipital pole (the foveal representation). On the right (Fig. 5c) the overlay shows duty cycle data, reflecting average receptive field size, which also increases with distance from the pole.
To quantify the trends noted by inspection of the coloured overlays in Figures 4 and 5, we divided the cortical surface (defined by the flatmap) into a set of sub-regions representing the different visual areas (V1, V2d, V2v, etc.; see Fig. 4a). As can be seen in Figure 4, the standard flatmap often did not include all the visual areas from V3A dorsally to V4 ventrally. This could have been rectified by flattening a larger area of cortex, but only at the expense of disproportionately increasing distortion in the flatmap. We instead made three separate, overlapping flatmaps of each occipital lobe: one with the ‘seed’ (centre) placed in the calcarine sulcus as illustrated in Figure 4, one with the seed placed more dorsally and one placed more ventrally. These were used to define regions of interest corresponding to the central (V1, V2), dorsal (V3, V3A) and ventral regions (VP, V4), respectively.
For each visual area, the irregularly shaped 3-D aggregation of voxels that were included within the area on the flatmap was identified, using ‘BrainTools’. For each voxel in the subset identified, two values were derived from the responses to expanding/contracting rings: the temporal phase of the response (defining eccentricity) and the duty cycle of the response (corresponding to average receptive field size). A scatter plot was then generated in which one value was plotted against the other.
Figure 6 shows a complete set of scatter plots relating duty cycle to eccentricity, for one hemisphere in a typical subject, R.R. The open symbols reflect data points that are unreliable because of proximity to the edge of the stimulus. Duty cycles will tend to fall in this region (and must take the same value at 360° phase as at 0°). For example, a neuron whose RF is centred on the edge of the stimulus, at 12° eccentricity, and which has an RF width of 4° will register a width of only 2° because the part of the RF beyond 12° is never stimulated.
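The truncation described in this example can be expressed in a few lines of Python. The sketch treats the RF as a one-dimensional interval along the eccentricity axis and clips it only at the outer edge of the stimulus; the function name and this simplification are ours.

```python
MAX_ECC_DEG = 12.0  # outer edge of the stimulus, from the text

def stimulated_width(centre_deg, rf_width_deg, max_ecc_deg=MAX_ECC_DEG):
    """Radial extent of an RF that actually falls within the stimulated field."""
    inner = centre_deg - rf_width_deg / 2
    outer = min(centre_deg + rf_width_deg / 2, max_ecc_deg)
    return max(outer - inner, 0.0)

print(stimulated_width(12.0, 4.0))  # 2.0 (half the RF lies beyond the edge)
print(stimulated_width(6.0, 4.0))   # 4.0 (RF lies wholly within the stimulus)
```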
The scatter in the plots is substantial, as might be expected, but in every visual area it is clear that duty cycle increases monotonically with eccentricity. The solid lines in Figure 6 are exponential functions, fitted to the data (solid symbols only) using a least-squares method. For all visual areas, the trend is approximately linear but is, if anything, expansive rather than compressive.
To give an alternative view of the functions, data were binned in terms of phase (eccentricity) and the mean value of each bin plotted. Figure 7 shows such plots, for both hemispheres of a different subject, A.W. The abscissa in Figure 7 is eccentricity, rather than temporal phase as in Figure 6. The transformation is straightforward since the two quantities are linearly related; 0° phase corresponds to zero eccentricity, while 360° phase corresponds to the wrap-around point at the outer edge of the stimulus (12° visual angle from the fovea). No assumptions need be made about changes in cortical magnification with eccentricity, since the relationship between phase and eccentricity holds irrespective of the location of a voxel on the cortical surface (and would hold even if there were no systematic map of eccentricity). The results again show an increase in duty cycle with eccentricity and are similar for the two hemispheres.
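The phase-to-eccentricity conversion and the binning can be sketched in Python as follows. The linear mapping is the one stated above; the number of bins, the function names and the example values are hypothetical and serve only to illustrate the procedure.

```python
def phase_to_eccentricity(phase_deg, max_ecc_deg=12.0):
    """Linear mapping: 0 deg phase -> 0 deg eccentricity, 360 deg -> max_ecc_deg."""
    return phase_deg / 360.0 * max_ecc_deg

def binned_means(phases_deg, duty_cycles, n_bins=10, max_ecc_deg=12.0):
    """Mean duty cycle per eccentricity bin, as (bin_centre_deg, mean) pairs."""
    width = max_ecc_deg / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for phase, duty in zip(phases_deg, duty_cycles):
        ecc = phase_to_eccentricity(phase, max_ecc_deg)
        idx = min(int(ecc / width), n_bins - 1)  # 360 deg falls in the last bin
        sums[idx] += duty
        counts[idx] += 1
    # Empty bins are omitted rather than reported as zero.
    return [((i + 0.5) * width, sums[i] / counts[i])
            for i in range(n_bins) if counts[i]]

# Hypothetical voxels: (phase in degrees, fitted duty cycle).
phases = [30.0, 45.0, 200.0, 210.0]
duties = [0.25, 0.27, 0.35, 0.37]
result = binned_means(phases, duties, n_bins=4)
for centre, mean in result:
    print(f"{centre:.1f} deg: mean duty cycle {mean:.2f}")
# 1.5 deg: mean duty cycle 0.26
# 7.5 deg: mean duty cycle 0.36
```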
In order to combine data across subjects, the binned means were averaged across the 10 hemispheres of the five subjects. The result is shown in Figure 8a. In all visual areas, duty cycle increases with eccentricity. The functions all have the same shape, suggesting that RF size scales with eccentricity in broadly the same way in all areas up to V4. However, there are clear differences in the vertical positions of the curves. At any given eccentricity, the lowest duty cycles are found in area V1. Duty cycles in V2 are substantially larger. Those in V3/VP are larger again and the largest duty cycles are in areas V3A and V4.
For clarity, the data from V2v and V2d have been combined to give a single function for V2 in Figure 8a. Results for V3 and VP are also combined. The breakdown is shown in Figure 8b, which shows separate results, again averaged across 10 hemispheres, for V2d, V2v, V3 and VP. As might be expected, the data for V2d and V2v are quite similar. More interestingly, the data for V3 and VP are also similar. In both instances, the dorsal area (V2d, V3) shows slightly higher duty cycles than the ventral area (V2v, VP), but the difference is not statistically significant and may reflect sampling error. Our data on receptive field size thus lend no additional support to the idea (Burkhalter et al., 1986) that V3 and VP are functionally distinct, although neither do they argue against such differences, since it is not in terms of RF size that they are said to differ in primates. In our recent study of motion sensitivity (Smith et al., 1998) we concluded that second-order motion is first explicitly encoded in V3 (lower hemifield) and VP (upper hemifield) and also found very similar responses in the two areas. We know of no fMRI evidence suggesting any type of functional distinction between VP (V3v) and V3 (V3d) in humans, but the possibility remains open.
The approximate linearity of the functions in Figure 8 is in line with single unit data from macaque monkeys. In macaque area V1, receptive field sizes (in terms of width, or square root of area) are approximately linearly related to eccentricity up to at least 20° (Hubel and Wiesel, 1974). Average field size increases from ~0.25° in the fovea to ~1.3° at 20°. Linearity may break down in and near the fovea (<5°), where RFs, although at their smallest, are bigger than predicted by this relationship (Dow et al., 1981; van Essen, Newsome and Maunsell, 1984). Our results are qualitatively in accord with these findings from primate neurophysiology. It is clear from Figures 6–8 that duty cycle increases markedly, monotonically and approximately linearly with eccentricity. There is no sign of any flattening of the functions near the fovea.
A repeated-measures analysis of variance was performed to test the main effects of cortical distance (eccentricity), hemisphere and area, together with their first-order interactions. There was no significant effect of hemisphere, nor any interaction of hemisphere with the other effects, so the data were thereafter pooled across hemispheres. The main effects of eccentricity [F(9,72) = 103.7, P < 0.001] and visual area [F(6,48) = 22.7, P < 0.001] were highly significant, whereas the interaction between these main effects was non-significant [F(54,432) = 2.2, P = 0.06]. Pairwise post-hoc comparisons (Scheffé) indicated that the differences between V1 and all other areas were significant. V2d/v also differed significantly from all other areas. The differences among V3/VP, V3A and V4 did not reach statistical significance.
Comparison of Visual Areas
In the visual cortex of macaque monkeys, RF size progressively increases at successively higher levels in the processing hierarchy (Zeki, 1978b). Specifically, at any given eccentricity, RFs are rather more than twice as large in V2 as in V1 (Gattass et al., 1981; Burkhalter and Van Essen, 1986). Burkhalter and Van Essen (Burkhalter and Van Essen, 1986) presented data for area VP and showed that RF sizes are about twice those in V2v at the same eccentricity. Similar estimates for V3 and VP are suggested by the data of Felleman and van Essen (Felleman and van Essen, 1987) and Gattass et al. (Gattass et al., 1988). RF sizes in V2v and V2d are similar to each other and the balance of evidence, at least, suggests that neurons in V3 and VP are similar in RF size (Gattass et al., 1988), even if not in all other respects.
In area V3A, there have been few studies of RF size, but RFs are larger again than in V3/VP. Gaska et al. (Gaska et al., 1987) claim that V3A neurons have RFs twice the size of those described in V3 by Felleman and van Essen (Felleman and van Essen, 1987). However, they do not report comprehensive data, having examined eccentricities only in the range 2–4°. Nakamura and Colby (Nakamura and Colby, 2000) have recently presented more complete data and these are in accord with earlier estimates. In area V4, RF sizes are quite well documented and are also substantially larger than in V3/VP (Maguire and Baizer, 1984; Gattass et al., 1988).
Figure 9 summarizes the situation. It combines RF size estimates from the various studies discussed above and includes all extrastriate visual areas up to V3A and V4. For clarity, we have averaged comparable estimates from different studies and characterized all functions as linear, rather than showing original data. Comparison of Figures 8 and 9 shows that the estimates of RF size based on the duty cycle of the response to a stimulus moving repeatedly through the visual field are largely in accord with primate physiological data in terms of the ordering of the visual areas. In primates, RFs are smallest in V1, larger in V2, larger again in V3, larger still in V4 and (if anything) largest of all in V3A. Our results follow this pattern, except that V3A and V4 are more similar in our data.
Relationship between Duty Cycle and Receptive Field Size
In principle, it is possible to transform the duty cycle measurements of Figure 8 into receptive field sizes at each location, in degrees of visual angle. This would allow detailed comparisons to be made with equivalent data from single unit recordings in primates. However, a number of factors affect the computation, some of which are currently imponderable, and in practice a reliable transformation is very difficult. Some of the factors that would need to be taken into account in order to effect such a transformation are listed below.
Effect of Receptive Field Scatter
At any given location in the primate visual cortex, receptive fields are centred in broadly the same location in visual space, but none the less show a degree of scatter about the mean location. The coverage of visual space at a given cortical locus is thus determined jointly by average receptive field size and receptive field scatter. In our experiments, RF size and RF scatter are confounded. The duty cycle of activation is affected in the same way by each; small RFs with a high degree of scatter will give broadly the same result as larger RFs with less scatter. Only with independent estimates of scatter could duty cycles be converted into RF sizes. Hubel and Wiesel (Hubel and Wiesel, 1974) showed that in monkey V1, RF size and RF scatter are correlated. With increasing eccentricity, RFs increase in size and RF scatter increases proportionately. It appears from their data that scatter accounts for about half the overall spread of RFs at a given locus, RF size accounting for the other half. Detailed measurements of RF scatter in monkey V1 were obtained by Dow et al. (Dow et al., 1981), although only in the foveal representation. However, detailed estimates of RF scatter at all eccentricities and in all the relevant extrastriate visual areas are not currently available.
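The confound between RF size and RF scatter can be illustrated with a toy calculation. The sketch below is not the analysis used in this study; it assumes boxcar-shaped RFs of equal width, centres spread evenly over a given scatter range, a thin bar sweeping once across a 20° extent, and an all-or-none response whenever the bar lies inside any RF. The function name and all parameter values are hypothetical.

```python
import numpy as np

def toy_duty_cycle(rf_size_deg, scatter_deg, sweep_extent_deg=20.0,
                   n_rf=201, n_steps=20001):
    """Fraction of one sweep during which a thin bar lies inside at least one RF.

    Toy assumptions (not from the paper): boxcar RFs of equal width,
    centres spaced evenly over scatter_deg around 0, all-or-none response.
    """
    centres = np.linspace(-scatter_deg / 2, scatter_deg / 2, n_rf)
    bar = np.linspace(-sweep_extent_deg / 2, sweep_extent_deg / 2, n_steps)
    # inside[i, j] is True when bar position i falls within RF j
    inside = np.abs(bar[:, None] - centres[None, :]) <= rf_size_deg / 2
    # the locus counts as active whenever any of its RFs is stimulated
    return inside.any(axis=1).mean()

small_rf_high_scatter = toy_duty_cycle(rf_size_deg=1.0, scatter_deg=3.0)
large_rf_low_scatter = toy_duty_cycle(rf_size_deg=3.0, scatter_deg=1.0)
print(small_rf_high_scatter, large_rf_low_scatter)
```

Under these assumptions both configurations cover the same 4° of visual space and so yield essentially the same duty cycle, which is why the two quantities cannot be separated from duty cycle alone.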
Effects of Voxel Size and Magnification Factor
Because, in our experiments, neural activity is integrated over 2–4 mm on the grey matter surface (depending on the orientation of the grey matter relative to the voxel), there will be variation in receptive field positions within each voxel arising from the fact that RF position varies systematically across the cortical surface. The magnitude of this within-voxel spread, in degrees of visual angle, is determined by the voxel size, which is the same at all locations, and the magnification factor M (the extent of cortex in millimetres per degree of visual angle), which falls sharply with increasing eccentricity. The relationship between M and eccentricity in human V1 has been estimated from fMRI data by Sereno et al. (Sereno et al., 1995). Their data indicate that M falls rapidly from about 40 mm/deg in the fovea and asymptotes in the periphery at ~1 mm/deg. This suggests that scatter due to within-voxel position changes will be negligible in the fovea, but will increase to significant levels in the periphery, where one voxel could span up to 4° of visual angle. It would be necessary to correct for this when interpreting duty cycles in terms of receptive field sizes. The effect of the correction would be to reduce RF size estimates in the periphery, making functions such as those in Figure 8 more linear (like those obtained neurophysiologically in monkeys), or even compressive.
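The arithmetic behind these figures can be sketched as follows. The calculation simply divides the cortical extent sampled by a voxel (in mm) by the magnification factor M (in mm/deg); the function name is hypothetical, and the values of M are the approximate foveal and peripheral figures quoted above, not a fitted function.

```python
def within_voxel_spread_deg(voxel_extent_mm, magnification_mm_per_deg):
    """Visual angle (deg) spanned by RF centres within one voxel.

    spread = cortical extent (mm) / M (mm per deg of visual angle)
    """
    if magnification_mm_per_deg <= 0:
        raise ValueError("magnification factor must be positive")
    return voxel_extent_mm / magnification_mm_per_deg

# Approximate values of M quoted in the text (illustrative only)
for label, m in (("fovea", 40.0), ("periphery", 1.0)):
    for extent_mm in (2.0, 4.0):
        spread = within_voxel_spread_deg(extent_mm, m)
        print(f"{label}: {extent_mm} mm of cortex at M = {m} mm/deg "
              f"spans {spread:.2f} deg")
```

With 4 mm of cortex per voxel, the spread is about 0.1° at M = 40 mm/deg and 4° at M = 1 mm/deg, consistent with the statement that the effect is negligible in the fovea but substantial in the periphery.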
Effect of Spatial Summation Properties
The logic of the experiment, as described in the Materials and Methods section, assumes that each neuron responds strongly as soon as the stimulus enters the RF, i.e. spatial summation is assumed to be highly non-linear. Clearly, this assumption is not entirely justified. In both monkeys and cats, spatial summation properties vary considerably among neurons in the same region. Some neurons sum signals within the RF linearly, others have very different properties; for example, compare the ‘standard’ and ‘special’ complex cells of Gilbert (Gilbert, 1977). Receptive fields will therefore tend to be larger than suggested by a given duty cycle, by a factor that is hard to determine and may well vary among the visual areas. It would be necessary to estimate average spatial summation curves for each visual area before it would be possible to derive precise receptive field sizes from our data.
Uneven Sampling of Neurons
Each functional voxel contains perhaps 1–2 million neurons. Our measurements of duty cycle therefore reflect the means of very large groups of neurons. These neurons probably do not contribute equally to the estimate. First, although we used a stimulus designed to drive as many neurons as possible, inevitably some neurons will respond to it better than others simply because of its particular spatial and temporal characteristics. Second, response properties become increasingly complex with distance from V1 and our stimulus may become correspondingly sub-optimal or inappropriate. In V1 we can probably assume that neurons with a range of spatial frequency preferences and RF sizes are activated. But in V4, for instance, it might be that only a specific sub-group of cells is stimulated. If these cells were atypical in terms of RF size, then misleading results would be obtained. Third, at least in monkeys, some neurons do not respond well to extensive stimuli but have inhibitory surrounds or end zones within the receptive field. Such neurons will be under-represented in our estimates. Fourth, at least in macaques, receptive field size and scatter vary considerably among the cortical layers, tending to be greatest in the deep layers. In these layers, cells are less numerous than in the superficial layers, but are larger (and so may have higher metabolic demands and contribute disproportionately to the BOLD response). It is unclear whether these laminar differences bias our estimates and, if so, in which direction.
Effects of Influences from beyond the Classical Receptive Field
Visual neurons are embedded in complex networks of both hierarchical and lateral connectivity. They receive inhibition and facilitation from other neurons with RFs both within and beyond the classical receptive field (CRF). The influence of those beyond the CRF is minimized in neurophysiological experiments, in which receptive field sizes are usually estimated using a small stimulus which is essentially confined to the receptive field. Although such a stimulus will activate many neurons, which might interact, all of those neurons have receptive fields in the same vicinity. In our experiments, the stimulus was much larger than the receptive fields of the cells it activated, so that many other neurons with adjacent or even quite distant receptive fields were simultaneously active. A growing literature suggests that the responses of neurons can be profoundly affected by such activity. It is therefore unsafe to assume that estimates of RF size obtained with our stimuli will correspond to those obtained with small stimuli. The differences will be complex and it is not currently possible to compensate for them accurately. This complicates the comparison of results with neurophysiological data.
Effects of Fixation Instability
During fMRI experiments, fixation is inevitably imperfect. This has the effect of blurring the location of activation on the cortical surface and will tend to lengthen the duration of activity in each voxel. These effects are, of course, much smaller in anaesthetized neurophysiological preparations and this difference might be important in comparing human and non-human primates. We have measured eye position in two subjects during a simple fixation task, conducted in the scanner, and found that it is characterized by frequent microsaccades whose average amplitude is ~0.7°. In order to compensate for fixation instability in calculating human receptive field sizes, it would be necessary to model its effect on the temporal activation profile and, hence, duty cycle. Clearly, detection of very small receptive fields will be compromised.
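A rough sense of why small receptive fields are most affected can be had from a simple blurring calculation. The sketch below is illustrative only and is not a model fitted to our data. It assumes that both the RF profile and the distribution of eye positions are Gaussian, so that their widths combine in quadrature, and it treats the ~0.7° microsaccade amplitude as a stand-in for the width of the eye-position distribution. The function name is hypothetical.

```python
import math

def apparent_rf_size_deg(true_rf_size_deg, fixation_jitter_deg):
    """Apparent RF width after blurring by unstable fixation.

    Assumption (not from the paper): Gaussian RF profile and Gaussian
    eye-position distribution, whose widths add in quadrature.
    """
    return math.hypot(true_rf_size_deg, fixation_jitter_deg)

JITTER_DEG = 0.7  # illustrative, taken from the microsaccade amplitude above

for true_size in (0.25, 0.5, 1.0, 2.0, 5.0):
    apparent = apparent_rf_size_deg(true_size, JITTER_DEG)
    print(f"true {true_size:.2f} deg -> apparent {apparent:.2f} deg "
          f"(inflated by a factor of {apparent / true_size:.2f})")
```

Under these assumptions the proportional inflation is large when the true RF is smaller than the jitter and becomes slight once the RF is several times larger, in line with the expectation that very small receptive fields are the hardest to measure.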
The latency and duration of responses vary among neurons and across visual areas. The effects on our duty cycle estimates may be negligible in V1, but duration of response, in particular, becomes an increasing concern with distance from V1 as RF properties become more complex and non-retinal factors become more pronounced.
These various complicating factors make it very difficult to provide reliable estimates of mean receptive field size in human visual cortex. None the less, we have reported findings that lead to some clear conclusions, at least in qualitative terms. Although we cannot estimate absolute receptive field sizes accurately, we have shown clear differences in relative receptive field size across eccentricities and across visual areas. In many respects, the results are in line with those expected on the basis of the primate literature. Specifically: (i) RFs increase in size with eccentricity and (ii) RFs are smallest in V1 and increase with hierarchical distance from V1. However, one difference between our results and those found neurophysiologically in monkeys merits comment. Comparison of Figures 8 and 9 shows that the visual areas are generally ordered in the same way in man and monkey, with one exception: physiological data suggest that RFs in V3A (Nakamura and Colby, 2000) are somewhat larger than those in V4 (Gattass et al., 1988), whereas the two areas appear to have RFs of a similar size in humans. It is possible, however, that either (a) our estimates are differentially affected by the complicating factors listed above or (b) the apparent difference between V3A and V4 in monkeys, which can only be inferred (no study having compared the two areas directly and in detail using the same measurement method), is not a real difference. It is therefore unsafe to assert that there is a species difference in this regard, especially since the two methodologies are so different. Perhaps the best way to resolve this issue would be to conduct experiments similar to ours on macaques.
This work was supported by grants from The Wellcome Trust to A.T.S. and from DFG (grant SFB 517, C9) to M.W.G. M.W.G. was supported by the Schilling Foundation. We thank Professor Dr J. Hennig for the use of MRI facilities in Freiburg.