Lesion studies argue for an involvement of the cortical dorsal medial superior temporal area (MSTd) in the control of optokinetic response (OKR) eye movements to planar visual stimulation. Neural recordings during OKR suggested that MSTd neurons directly encode stimulus velocity. On the other hand, studies using radial visual flow together with voluntary smooth pursuit eye movements showed that visual motion responses were modulated by eye movement-related signals. Here, we investigated neural responses in MSTd during continuous optokinetic stimulation using an information-theoretic approach for characterizing neural tuning with high resolution. We show that the majority of MSTd neurons exhibit gain-field-like tuning functions rather than directly encoding one variable. Neural responses showed a large diversity of tuning to combinations of retinal and extraretinal input. Eye velocity-related activity was observed prior to the actual eye movements, reflecting an efference copy. The observed tuning functions resembled those emerging in a network model trained to perform summation of 2 population-coded signals. Together, our findings support the hypothesis that MSTd implements the visuomotor transformation from retinal to head-centered stimulus velocity signals for the control of OKR.
The dorsal medial superior temporal area (MSTd) is located in the posterior parietal cortex (PPC) and is part of the visual motion processing system (Andersen 1989). Besides pure retinal signals, neurons in this area are also driven by vestibular and eye movement-related signals (Newsome et al. 1988; Gu et al. 2006; Ono and Mustari 2006). This multi-modal behavior led to the suggestion that MSTd might compensate for distortions caused by self-generated eye and head movements in the perception of heading direction (Bradley et al. 1996; Page and Duffy 1999; Gu et al. 2007; Bremmer et al. 2010).
Apart from its perceptual function, however, there is also strong evidence for an involvement of MSTd in oculomotor control. Lesions in this area lead to severe impairment of the optokinetic response (OKR), which is the involuntary eye movement that compensates for planar motion of the visual scene (Dürsteler and Wurtz 1988; Takemura et al. 2007). Analysis of neural latencies (Kawano et al. 1994) and postsaccadic response behavior (Takemura and Kawano 2006) gave further evidence for the participation of MSTd in OKR. A recent series of studies using a smooth pursuit paradigm in combination with planar visual stimulation suggested that MSTd neurons might encode the velocity of large-field visual stimuli in head-centered coordinates (Inaba et al. 2007, 2011; Inaba and Kawano 2010). Yet, it is still unknown which exact function this cortical region might serve during visuomotor transformation.
The term ‘visuomotor’ refers to the neural mechanisms by which visual stimuli are converted into motor commands. One essential processing step is the transformation of retinal or eye-centered signals to the body-centered coordinates of muscles for movement (Andersen et al. 1993; Crawford et al. 2011). A number of regions in the PPC are assumed to be involved in these coordinate transformations for various kinds of movements. For instance, the parietal reach region (PRR) is supposed to be a visuomotor interface for reaching arm movements, whereas the lateral intraparietal area (LIP) might serve such a function for saccadic eye movements (Snyder 2000; Buneo and Andersen 2006). Tracking eye movements such as smooth pursuit or OKR require a transformation of the retinal image velocity signal to a head-centered stimulus velocity signal (Young 1971; Robinson et al. 1986; Nuding et al. 2008).
Neural network models have demonstrated that gain fields might comprise the neural substrate for performing coordinate transformations (Zipser and Andersen 1988; Pouget and Sejnowski 1997; Beintema and van den Berg 1998). Gain-field-like tuning behavior is characterized by a modulation of the neural response depending on a certain variable, without changing the actual receptive field characteristics in relation to another variable (Salinas and Thier 2000). Andersen and Mountcastle (1983) were the first to observe this kind of tuning in area 7a of the PPC, where visually responsive neurons are modulated by the eye's position.
In this work, we analyzed MSTd neural responses during continuous planar optokinetic stimulation. Using a novel information-theoretic technique for neural data analysis, we determined high-resolution tuning functions and compared them with the predictions from a neural network model trained to perform a transformation of the retinal image velocity signal to a head-centered stimulus velocity signal.
We recorded extracellular potentials from 81 MSTd neurons in 2 monkeys (Macaca mulatta; monkey A: 70 neurons, monkey B: 11 neurons). All procedures were performed at the Washington National Primate Research Center at the University of Washington (Seattle, WA, USA) in compliance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. The protocols were reviewed and approved by the Institutional Animal Care and Use Committee at the University of Washington. Surgical procedures were performed in a dedicated facility using aseptic techniques under isoflurane anesthesia (1.25–2.5%). Vital signs including blood pressure, heart rate, blood oxygenation, body temperature, and CO2 in expired air were monitored with a Surgivet Instrument (Waukesha, WI, USA) and maintained within normal physiological limits. To permit single unit recording, we used stereotaxic methods to implant a titanium head stabilization post and a titanium recording chamber (Crist Instruments, MD, USA) over MST cortex (posterior = 5 mm; lateral = 15 mm). In the same surgery, a scleral search coil for measuring eye movements was implanted underneath the conjunctiva of one eye. Postsurgical analgesia (buprenorphine 0.01 mg/kg, IM) and anti-inflammatory (banamine 1.0 mg/kg, IM) treatment were delivered every 6 h for several days, as indicated. For verifying MSTd location, we used functional, histological, and magnetic resonance imaging criteria (described in detail in Ono et al. 2010). During the experiments, monkeys were seated in a primate chair in a dark room with their head restrained in the horizontal stereotaxic plane. About 10% of the isolated neurons in MSTd did not respond selectively to large moving visual stimuli and were discarded from our analysis.
The OKR is a tracking eye movement to moving large-field scenes. The term ‘ocular following response’ (OFR) generally refers to the reflexive eye movements elicited by brief motion of the visual scene (Miles et al. 1986). OFR is characterized by very short onset latency and can be regarded as the initial phase of the OKR. The combination of OKR and fast resetting saccades during prolonged unidirectional stimulation is called optokinetic nystagmus. Tracking eye movements to small moving objects are called smooth pursuit. In our experiment, a visual large-field stimulus (35° × 35° fixed random dot pattern with mean luminance of 100 cd/m²) was rear-projected on a tangent screen. During the ‘white noise motion’ paradigm, the stimulus was moving randomly back and forth in one direction in a translational, planar way (Duffy and Wurtz 1991), meaning that the position of the pattern on the screen changed continuously, but not the pattern itself (Fig. 1A). Each neuron was tested in the axis of its preferred direction, i.e., the direction which elicits maximal spiking activity. For determination of the preferred direction, the large-field pattern moved in a circular (not spiral) trajectory as a search stimulus. This ensured that every contrast element in the pattern moved at the same speed, through all directions. The preferred direction was estimated from the on-line response of each neuron. Different speeds (typically between 10 and 60°/s) were used to test each neuron in the preferred direction. Further testing was conducted along one of 4 directions (vertical, horizontal, diagonal left, or diagonal right) that most closely matched the estimated preferred direction. The frequency content of the white noise motion stimulus was uniformly distributed. Both the eccentricity of the center of the stimulus pattern and the stimulus velocity were normally distributed. The stimulus range was adapted to the monkey's behavior.
Maximal eccentricity and maximal stimulus velocity ranged from ±20° to ±40° and from 70 to 200°/s, respectively. The ranges for the standard deviation of eccentricity and stimulus velocity were 5°–10°, and 20°/s–60°/s, respectively. The monkeys had previously been trained to deliberately follow any large-field stimulus motion they observed. Reward was given for keeping eye position in the range of ±5–7° around the center of the stimulus pattern. In this experiment, however, stimulus velocity repeatedly exceeded maximal tracking eye velocity (∼60°/s) for short periods. Together with the frequent and rapid changes in stimulus direction and velocity, this impeded perfect tracking and decoupled the image velocity and eye velocity signals. Unpredictable motion and high stimulus velocities were intended to prevent the monkeys from continuously tracking individual dots of the pattern, so that they performed OKR rather than smooth pursuit. Supplementary Video 1 exemplifies the white noise motion paradigm with actual 2D eye movements. A sample trace and corresponding extracellular recording in MSTd is shown in Figure 1B. Over the course of several days, the monkeys improved in following the stimulus. In order to minimize the dependency between the image and eye velocity signals, we increased the stimulus range according to the monkey's behavior. We also used an algorithm for de-correlation of image and eye velocity, which is described in Data Analysis.
The average recording time per dataset was ∼600 s. Action potentials were detected using a hardware window discriminator (Bak Electronics, MD, USA). Eye position signals were processed with anti-aliasing filters at 200 Hz using 6-pole Bessel filters (latency 5 ms) before digitization at 1 kHz with 16-bit precision. Subsequently, they were filtered with a Gaussian low pass (cutoff frequency 30 Hz) and 3-point differentiated to obtain velocity traces. Saccade periods were detected and removed from the data using a previously described algorithm (Ladda et al. 2007). Briefly, an estimate of the slow-phase component (SPC) was initialized to zero and iteratively improved in each step. The difference between the actual eye velocity trace and the current SPC served as an estimate of the fast-phase component (FPC). When the FPC exceeded a threshold (100°/s in the first step, 50°/s in the second step), a saccade was detected. The new SPC was then computed by linear interpolation of the eye velocity across saccades and subsequent filtering with a Gaussian low pass (cutoff frequency: 5 Hz in the first step, 10 Hz in the second step).
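The iterative slow-phase/fast-phase separation described above can be sketched as follows. This is a simplified Python reconstruction; the cutoff-to-sigma conversion for the Gaussian low pass and the 1 kHz sampling rate are assumptions, not details taken from the original implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

FS = 1000  # sampling rate in Hz (eye signals were digitized at 1 kHz)

def gauss_lowpass(x, cutoff_hz, fs=FS):
    # sigma chosen so the Gaussian's -3 dB point lies near cutoff_hz
    # (one common convention; the paper does not state the conversion)
    sigma = fs * np.sqrt(np.log(2)) / (2.0 * np.pi * cutoff_hz)
    return gaussian_filter1d(x, sigma)

def remove_saccades(eye_vel, thresholds=(100.0, 50.0), cutoffs=(5.0, 10.0)):
    """Two-pass estimate of the slow-phase component (SPC).

    In each pass, the fast-phase component (FPC) is the difference between
    the raw eye velocity and the current SPC; samples where |FPC| exceeds
    the threshold are treated as saccades, bridged by linear interpolation,
    and the result is low-pass filtered to form the new SPC.
    """
    spc = np.zeros_like(eye_vel)                  # SPC initialized to zero
    t = np.arange(eye_vel.size)
    for thr, fc in zip(thresholds, cutoffs):
        fpc = eye_vel - spc                       # fast-phase estimate
        sacc = np.abs(fpc) > thr                  # detected saccade samples
        interp = np.interp(t, t[~sacc], eye_vel[~sacc])
        spc = gauss_lowpass(interp, fc)
    return spc
```

Applied to a velocity trace containing saccadic spikes, the returned SPC retains the slow tracking component while the fast phases are bridged and smoothed away.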
Information-Theoretic Data Analysis
Our mutual information based approach for neural data analysis has been described in detail previously (Brostek et al. 2011). Briefly, let S denote a binary random variable for the observation of a spike or non-spike, with p(s) denoting the probability mass function of spike occurrence. The discrete random variable V denotes the observation of a specific combination of explanatory variables with associated probability mass function p(v). This makes it possible to define a probabilistic neural tuning function by the conditional probability p(s|v) of observing a spike given any combination of explanatory variables. By multiplication with the sampling rate, this probability translates directly into an expectation value of the spiking rate. Using Bayes' theorem, p(s|v) can be expressed as the quotient of the joint probability mass function p(s,v) divided by p(v): p(s|v) = p(s,v) / p(v).
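As an illustration of how such a probabilistic tuning function can be estimated from data, the following Python fragment histograms spikes over a two-dimensional variable space and divides the joint counts by the marginal counts. The fixed bin count is an illustrative choice, not the Knuth-optimized binning used in the study, and the smoothing step is omitted.

```python
import numpy as np

def tuning_function(spikes, var1, var2, bins=20, fs=1000):
    """Estimate the expected firing rate p(s|v) * fs on a 2D grid.

    spikes: 0/1 spike indicator per sample; var1, var2: explanatory
    variables per sample (e.g., image velocity and eye velocity).
    """
    # p(v) up to normalization: sample counts per bin
    H_v, xe, ye = np.histogram2d(var1, var2, bins=bins)
    # p(s=1, v) up to normalization: spike counts per bin
    H_sv, _, _ = np.histogram2d(var1, var2, bins=[xe, ye], weights=spikes)
    with np.errstate(invalid="ignore", divide="ignore"):
        p_s_given_v = np.where(H_v > 0, H_sv / H_v, np.nan)
    return p_s_given_v * fs  # expected spiking rate in Hz
```

Bins without samples are returned as NaN, mirroring the omission of sparsely sampled bins in the analysis.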
Estimates of p(v) and p(s,v) can be attained by histogramming the experimental data (Fig. 1C). For each dataset, the optimal number of bins was estimated by using an algorithm proposed by Knuth (2006). To smooth the histograms, we used a symmetrical Gaussian low-pass filter with a standard deviation of 2 bin widths. This approach of tuning function determination poses 3 major challenges: first, the variable space needs to be fully covered, while ensuring statistical independence of all explanatory variables. Second, neural latencies have to be considered, as the tuning function p(s|v) critically depends on them. Third, the dependence of neural activity on each explanatory variable needs to be estimated. Concerning the first issue, we used a white noise motion paradigm as described above. Additionally, we used an algorithm to remove samples with high linear correlation from the datasets. This algorithm randomly selects parts of the original dataset for the analyzed set as long as a maximal level of linear correlation between image velocity and eye velocity is not exceeded. We set this limit to a Pearson's correlation coefficient of 0.2. During the latency estimation process, this procedure was repeated for all combinations of time-shifted explanatory variables. The average percentage of discarded samples per dataset was 39.7 ± 23.3%. Bins containing <32 samples were omitted from the analysis. To estimate the neural latency, we analyzed the mutual information I between explanatory variables V and spiking activity S: I(S;V) = Σ p(s,v) log2 [ p(s,v) / ( p(s) p(v) ) ], where the sum runs over all values of s and v.
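The de-correlation step can be illustrated with a simple greedy subsampling scheme: samples are considered in random order and kept only as long as the running Pearson correlation between the two signals stays below the limit. This is a sketch of the idea only; the selection details of the original algorithm may differ.

```python
import numpy as np

def decorrelate(image_vel, eye_vel, r_max=0.2, rng=None):
    """Greedily subsample so |Pearson r| between the signals stays < r_max.

    Returns sorted indices of the retained samples. Illustrative sketch;
    not the original implementation.
    """
    rng = np.random.default_rng(rng)
    order = rng.permutation(image_vel.size)
    keep = []
    for i in order:
        cand = keep + [i]
        if len(cand) < 3:           # correlation undefined for < 3 samples
            keep = cand
            continue
        r = np.corrcoef(image_vel[cand], eye_vel[cand])[0, 1]
        if abs(r) < r_max:          # accept only if the limit is respected
            keep = cand
    return np.sort(keep)
```

By construction, the retained subset satisfies the correlation limit whenever it contains at least 3 samples; the fraction of discarded samples depends on how strongly the raw signals are coupled.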
The tuning index (TI) quantifies the ‘slope’ of the 2D tuning function and was determined as follows: let x and y denote the 2 axes of the tuning function, and z the activity axis. In this 3D coordinate system, a plane z = ax + by + c was fitted to the tuning function by least squares, and the TI was derived from the fitted slope parameters a and b.
Negative TI-values indicate predominant vertical tuning, corresponding to higher selectivity for image velocity, whereas positive values indicate horizontal, eye velocity-related tuning. We used the function ‘regress’ from the Matlab Statistics Toolbox (The MathWorks Inc., Natick, MA, USA) for estimating the TI-values and the 95% confidence intervals of the estimated parameters. Based on these, a 95% confidence interval was determined for each TI-value.
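The plane fit underlying the TI can be sketched with an ordinary least-squares regression over the tuning-function grid. The exact TI formula is defined by the original equation; the log-slope-ratio used below is an illustrative assumption that merely reproduces the sign conventions described above (negative for image velocity-dominated, positive for eye velocity-dominated, zero for equal slopes).

```python
import numpy as np

def fit_plane(x, y, z):
    """Least-squares fit of z ≈ a*x + b*y + c; returns (a, b, c)."""
    A = np.column_stack([x, y, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coef

def tuning_index(a, b):
    """Illustrative TI: log-ratio of eye- vs image-velocity slope.

    Assumed formula, not the paper's equation. With x the image velocity
    axis and y the eye velocity axis: TI < 0 means image velocity-dominated
    tuning, TI > 0 eye velocity-dominated, TI = 0 equal dependence.
    """
    return np.log(abs(b) / abs(a))
```

For a tuning function increasing equally with both variables (as expected for a direct stimulus-velocity encoder) the fitted slopes are equal and the index is zero.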
We used 2 different models: a system-level model of the OKR control system (Fig. 4A) and a neural network model of the coordinate transformation (Fig. 4C). The system-level model, a simplified version of previous models (Marti et al. 2008; Nuding et al. 2008), simulates the interaction and signal flow between different anatomical regions during OKR and was implemented in Simulink 7.1 (The MathWorks Inc., Natick, MA, USA) using standard differential equation solver settings. The transfer functions for the eye plant, which models the inertia of the eye-muscle system (Glasauer 2007), and the internal plant model, which estimates eye velocity from an efference copy of the motoneuron signal, are given in Figure 4A.
The neural network model was adapted from Pouget and Sejnowski (1997). The 31 image velocity input units were Gaussian tuned with σ = 32°/s and equally distributed peaks pi between −60°/s and 60°/s:
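The Gaussian-tuned input population can be written compactly as follows, transcribing the stated parameters (31 units, σ = 32°/s, peaks equally spaced between −60°/s and 60°/s):

```python
import numpy as np

N_UNITS, SIGMA = 31, 32.0
peaks = np.linspace(-60.0, 60.0, N_UNITS)  # preferred velocities p_i (deg/s)

def population_code(v):
    """Activity of the Gaussian-tuned input units for velocity v (deg/s)."""
    return np.exp(-0.5 * ((v - peaks) / SIGMA) ** 2)
```

A velocity of 0°/s, for example, maximally activates the central unit, with activity falling off symmetrically toward the population's edges.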
In the first part, we present our electrophysiological findings. Thereafter, we compare these data with theoretical predictions from a computational model.
MSTd Neural Tuning During Optokinetic Stimulation
The visual stimulus consisted of a large-field random dot pattern moving randomly in the axis of preferred direction, which was separately determined for each neuron. The monkey's task was to follow this planar ‘white noise motion’ stimulus as well as possible (see Methods). Maximal stimulus velocities of up to 200°/s allowed us to cover wide ranges of both eye velocity and retinal image velocity values at the same time.
For analyzing the data, we used a novel approach (Brostek et al. 2011), which estimates the latency and dependence of neural activity on different visual and eye movement-related variables based on maximization of mutual information. Neural activity correlated mainly with 2 variables: image velocity (or retinal slip) and eye velocity, which accounted on average for 35% and 30% of mutual information with spiking activity, respectively. The remaining mutual information was shared by eye acceleration (13%), image acceleration (12%), and eye position (10%), showing that the latter 3 variables correlated marginally with neuronal activity.
Figure 2A shows the latency of spiking activity relative to the image and eye velocity signals. All neurons fired after the image velocity signal, with a mean neural latency of 63 ± 27 (SD) ms. This value for the visual response latency agrees well with previous findings (Kawano et al. 1994; Schmolesky et al. 1998). For eye velocity, the distribution was bimodal. Most neurons (77%) fired before the eye velocity signal. For these neurons, the average eye movement-related latency was −29 ± 21 ms. This signal therefore cannot be of sensory origin, but indicates a premotor variable, which probably reflects an internally generated efference copy of the eye velocity signal. The resulting latency of the eye movement relative to the image velocity signal of ∼90 ms is in accordance with previous OKR ramp onset measurements (Ono et al. 2010). Nineteen neurons exhibited positive neural latency relative to eye velocity. We excluded this subpopulation from subsequent analysis and will refer to these units in the next section.
Figure 2B–D provide an overview of 2D tuning functions determined from our recordings. Neural latencies were estimated for each neuron and compensated in the following analysis. The stimulus range was adapted continuously to the monkey's behavior and therefore varied across datasets. Preferred retinal image velocity differed across the population in accordance with previous findings (Churchland and Lisberger 2005). Figure 2B shows tuning functions of 9 example neurons, which exhibited clear gain-field-like behavior with an approximately bell-shaped selectivity for image velocity, which is strongly modulated by eye velocity. In all gain-field neurons, the visual response increased with increasing eye velocity in the neuron's preferred direction, and decreased with eye movement in the opposite direction. The shape of the tuning functions, however, differed notably across neurons. Some tuning functions were comparatively ‘flat’, with neurons showing less selectivity for image velocity with increasing eye velocity (e.g., A97.4 and A100.2). In other units, tuning functions exhibited rather ‘narrow’ forms, preserving their visual selectivity for high eye velocities (e.g., B28.1). In some cases, preferred velocity seemed to shift towards lower values with increasing eye velocity (e.g., A94.5).
In 29 neurons (36%, monkey A: 24, monkey B: 5), tuning functions expressed gain-field-like behavior as shown in Figure 2B. Another 33 units (41%, monkey A: 29, monkey B: 4) showed a modulation of the visual response with increasing eye velocity, but had open, sigmoid tuning for image velocity, thus resembling part of a gain field (Fig. 2D, left). We therefore refer to these units as ‘partial gain fields’. Prior studies in the neighboring middle temporal cortex (MT) that tested wider velocity ranges usually found closed image velocity tuning functions (Mikami et al. 1986). As the image velocity signal is assumed to be projected via MT to MST (Tusa and Ungerleider 1988), the open image velocity tuning in part of our neurons probably resulted from a limited testing range. In 7 cases (9%, monkey A: 5, monkey B: 2), neurons exhibited virtually pure selectivity for image velocity (Fig. 2D, middle), whereas in 2 neurons (2%, monkey A only) neural activity increased mainly with eye velocity, showing marginal modulation by image velocity (Fig. 2D, right). The distributions of tuning types for monkey A and monkey B were not significantly different from each other (Pearson's χ2-test: P = 0.65).
To quantify the ‘slope’ of the 2D tuning functions, we determined the TI (see Methods) for all neurons (Fig. 2E). A negative TI-value indicates a stronger dependency of neural activity on image velocity, whereas a positive value indicates more eye velocity-related tuning. A TI-value of 0 indicates uniform increase of neural activity with image and eye velocity. This relation would be expected in neurons that directly encode stimulus velocity, which is the sum of image and eye velocity. However, only 3 units (4%) had TI-values that were not different from zero on a 5% significance level. We rather found a continuous, Gaussian-shaped (Lilliefors test: P = 0.31; mean = 0.05, σ = 0.82) distribution of estimated TI-values, ranging from −2.8 to 2.7. Hence, no specific form of gain field dominated the population. The 2 special types of pure image velocity-related (Fig. 2D, middle) and pure eye velocity-related tuning (Fig. 2D, right) represent extreme values of this distribution. Statistical comparison of the TI-values for monkey A and monkey B suggests that both samples come from the same underlying distribution (two-sample Kolmogorov–Smirnov test: P = 0.13). Figure 2F shows the population's mean firing rates for image and eye velocity values of 20°/s extracted from the tuning functions. This Gamma-shaped distribution (Kolmogorov–Smirnov test for Gamma distribution with α = 1.65, β = 27.63: P = 0.43) covers a wide range from 10 to 200 Hz.
Neurons with Late Eye Movement-Related Responses
In a subpopulation of 19 neurons (23%), spiking activity had a mean latency of 105 ± 39 ms relative to the eye velocity signal (Fig. 2A). Similar latencies after eye movement onset are typically observed in ‘smooth pursuit neurons’ (Newsome et al. 1988; Ono et al. 2010), a type of MSTd neuron exhibiting increased activity during tracking of small moving targets. Most neurons of this subpopulation not only differed in neural latency but also exhibited significant differences in tuning behavior. In 6 out of these 19 units (32%), the tuning functions exhibited peak values for positive image velocity and negative eye velocity (Fig. 3). Such opposite preferred directions for visual and eye movement-related activity were also previously reported in MSTd smooth pursuit neurons (Komatsu and Wurtz 1988; Shenoy et al. 2002). In another 5 out of the 19 neurons (26%), the mutual information between spiking activity and eye position or acceleration exceeded the mutual information values between spiking activity and image velocity. This means that neuronal activity in these neurons was best correlated with eye movement rather than with image related variables. The subpopulation of MSTd neurons with positive neural latency relative to eye velocity is probably not participating in OKR. Rather, it might be involved in smooth pursuit control (Ono and Mustari 2006; Ono et al. 2010) or pursuit compensation for the perception of heading direction (Bradley et al. 1996).
Figure 4A shows a system-level model of the OKR control circuit, analogous to well-established models of the smooth pursuit system (Robinson et al. 1986; Glasauer 2007; Marti et al. 2008). The ‘eye plant’ is usually modeled by a low-pass filter with a time constant of 200 ms and simulates the inertia of the eye-muscle system (Glasauer 2007). It receives as input the motoneuron signal and yields as output the eye velocity signal. The signal processing time in retina, thalamus, and visual cortical areas is modeled by a 60 ms delay, resembling the measurements from our MSTd data. Due to this delay, a pure negative-feedback circuit, with image velocity driving the eye plant directly, would not be stable. A straightforward way to prevent instability is the introduction of an internal positive feedback to the control signal (Young 1971; Robinson et al. 1986; Nuding et al. 2008). An internal model of the eye plant, presumably located in the cerebellum (Wolpert et al. 1998; Glasauer 2003; Porrill et al. 2004; Marti et al. 2008) or PPC (Mulliken et al. 2008), might receive an efference copy of the motor command and use it to estimate eye velocity. The summation of image velocity and estimated eye velocity yields the estimated stimulus velocity signal. This processing step can also be conceived as transforming a retinal signal to head-centered coordinates. The resulting signal can be used to drive the eye plant. To account for neuronal processing time in premotor brain stem and cerebellar areas (Büttner and Büttner-Ennever 2006), we introduced an additional delay of 20 ms to the eye plant. The resulting model of the OKR system is stable and yields realistic step function responses with an eye movement onset latency of 80 ms (Fig. 4B). It should be mentioned that leading of the internally estimated eye velocity signal relative to the actual eye velocity signal (i.e., negative latency) is not critical for stability. 
The system is also stable for a positive delay of the internal eye velocity signal, as would be expected for proprioceptive feedback. In this case, however, the model's step response exhibits slower rising and approximates the empirical observations less well.
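The closed-loop behavior described above can be reproduced with a minimal discrete-time simulation. This is a sketch only: it assumes a perfect internal plant model (efference-copy estimate equal to eye velocity), millisecond time steps, and forward-Euler integration; the delay and time-constant values follow the text.

```python
import numpy as np

DT = 0.001                      # simulation step: 1 ms
TAU = 0.2                       # eye plant time constant (s)
D_VIS, D_MOT = 60, 20           # visual and premotor delays (ms / steps)

def simulate_okr(stim):
    """Discrete-time sketch of the OKR loop with internal positive feedback."""
    n = stim.size
    eye = np.zeros(n)           # eye velocity
    cmd = np.zeros(n)           # estimated head-centered stimulus velocity
    for t in range(1, n):
        tv = t - D_VIS
        # delayed retinal image velocity (stimulus minus eye velocity)
        image_vel = stim[tv] - eye[tv] if tv >= 0 else 0.0
        # efference-copy estimate of eye velocity (perfect internal model)
        eye_est = eye[tv] if tv >= 0 else 0.0
        cmd[t] = image_vel + eye_est        # summation stage
        tm = t - D_MOT
        drive = cmd[tm] if tm >= 0 else 0.0
        # first-order eye plant: tau * de/dt = drive - e
        eye[t] = eye[t - 1] + DT / TAU * (drive - eye[t - 1])
    return eye
```

A 20°/s velocity step produces a stable response with an eye movement onset latency of 80 ms (the sum of the visual and premotor delays), consistent with the step response in Figure 4B.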
To analyze the sensorimotor coordinate transformation at a deeper, more biologically plausible level, we designed a firing-rate neural network to model the summation of retinal image velocity with eye velocity (Fig. 4C). As in our electrophysiological study, we restricted this analysis to one (preferred) direction. Three different neural populations coded the velocities of image, eye, and stimulus. Using the back-propagation learning algorithm (Rumelhart et al. 1986), the network was trained to estimate the proper stimulus velocity value for any given combination of image and eye velocity input values. The shapes of the resulting tuning functions of the network's intermediate layer units had remarkable similarity to the tuning functions we found in MSTd. All tuning functions obtained from the simulation exhibited a kind of eye velocity gain-field shape. Some neurons had narrow, image velocity-related tuning functions, whereas other units showed rather flat, eye velocity-related tuning. The similarity to our electrophysiological results was corroborated by the neural network's distribution of TI-values (Fig. 4D). Before learning, all weights were randomly sampled from a uniform distribution. The resulting tuning functions were quite similar in shape, reflected by a narrow distribution of TI-values. During the training process, however, this distribution broadened, demonstrating the network's demand for a certain diversity of flat and narrow gain fields for accomplishing the transformation task. After completion of the learning process, the distribution of TI-values showed no significant differences to our MSTd data (two-sample Kolmogorov–Smirnov test: P = 0.21).
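A minimal version of such a network can be sketched as follows. The hidden-layer size, sigmoid nonlinearity, learning rate, and restriction of velocities to ±30°/s (so that the sum stays inside the coded range) are illustrative assumptions; the original model follows Pouget and Sejnowski (1997) and may differ in these details.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA, N_POP, N_HID = 32.0, 31, 40           # N_HID is an assumption
peaks = np.linspace(-60.0, 60.0, N_POP)

def encode(v):
    """Gaussian population code for a velocity value (deg/s)."""
    return np.exp(-0.5 * ((v - peaks) / SIGMA) ** 2)

def batch(n=64):
    """Random (image, eye) velocity pairs and their population codes."""
    iv = rng.uniform(-30, 30, n)
    ev = rng.uniform(-30, 30, n)
    X = np.stack([np.concatenate([encode(a), encode(b)]) for a, b in zip(iv, ev)])
    Y = np.stack([encode(a + b) for a, b in zip(iv, ev)])  # stimulus velocity
    return X, Y

sig = lambda x: 1.0 / (1.0 + np.exp(-x))
W1 = rng.uniform(-0.05, 0.05, (2 * N_POP, N_HID))   # input -> intermediate
W2 = rng.uniform(-0.05, 0.05, (N_HID, N_POP))       # intermediate -> output

def mse():
    X, Y = batch(256)
    return float(np.mean((sig(X @ W1) @ W2 - Y) ** 2))

err_before = mse()
lr = 0.05
for _ in range(2000):                                # plain back-propagation
    X, Y = batch()
    H = sig(X @ W1)                                  # "MSTd-like" hidden layer
    E = H @ W2 - Y                                   # output error
    W2 -= lr * H.T @ E / len(X)
    W1 -= lr * X.T @ ((E @ W2.T) * H * (1 - H)) / len(X)
err_after = mse()
```

After training, the hidden units' responses as a function of image and eye velocity can be mapped out in the same way as the experimental tuning functions, yielding the gain-field-like shapes compared in Figure 4D.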
Figure 4E shows the average activity of the trained intermediate units for image and eye velocity values of 20°/s after learning. As in our data, the activity was also not Gaussian distributed (Lilliefors test: P = 0.013). However, the neural network's units exhibited much less variability in their activity than MSTd neurons. The distribution of input and output weight values is shown in Figure 4F. Input weights for image (U) and eye velocity (V) were distributed almost equally between −0.02 and 0.07 after learning. The distribution of output weights (W) was likewise flat and almost symmetrical between −0.05 and 0.055.
Our analysis revealed a large diversity of eye velocity gain-field types in MSTd including asymmetric and non-separable tuning functions. The distribution of observed gain-field shapes was almost identical to the predictions from a neural network model trained to perform the summation of image and eye velocity. Assuming participation of MSTd in the OKR control system (Dürsteler and Wurtz 1988; Takemura et al. 2007), our findings substantiate the hypothesis that this area implements the stage in visuomotor processing where a retinal image velocity signal is transformed to a head-centered estimate of stimulus velocity. Negative latency of the modulatory activity component indicates its premotor character and argues for an estimate of the eye velocity signal, generated by an internal model of the eye plant system.
In the present study, we used a white noise motion paradigm for optokinetic stimulation. Random motion and high stimulus velocities were intended to prevent continuous tracking of certain dots of the pattern. Yet, it is not possible to completely exclude the possibility that the monkeys were performing a mixture of OKR and smooth pursuit eye movements. The observed short eye movement latency of ∼90 ms, however, argues against major participation of the smooth pursuit system, as the latter typically exhibits latencies of 120–140 ms for randomized motion onset (Ono et al. 2010). Even though predictive mechanisms could in principle shorten pursuit eye movement latency, in our experiment, the highly randomized motion pattern minimized the possible contribution of such mechanisms.
Gain fields have been found before in various other areas of the PPC. For instance, visual responses of neurons in LIP and cortical area 7A are gain modulated by eye and head position signals (Snyder et al. 1998). The activity of neurons in PRR is modulated by eye and limb position (Chang et al. 2009). Nevertheless, all studies so far have suffered from the problem that characterization of the neural responses was incomplete in the sense that only very few and specific combinations of visual input and motor output could be tested. Our white noise motion paradigm overcomes these difficulties and allows us to characterize neural tuning over a large range of input–output values, thereby enabling us to analyze the distribution of different gain-field types. The finding of a well-defined subpopulation with differing tuning and latency behavior is in agreement with previous MSTd studies and further demonstrates the strengths of our approach.
Previous electrophysiological studies in area MSTd predominantly focused on its role in perception of self-motion and heading direction. Most of these used paradigms consisting of radial visual stimulation in combination with smooth pursuit eye movements. As MSTd neurons generally exhibit different behavior during smooth pursuit and OKR (Komatsu and Wurtz 1988; Kawano et al. 1994; Ono et al. 2010), as well as for radial and planar visual stimulation (Duffy and Wurtz 1991), our results are not directly comparable with these studies. Nevertheless, a number of studies using smooth pursuit and radial stimulation paradigms found that visual responses of MSTd neurons are modulated during eye movements (Bradley et al. 1996; Page and Duffy 1999; Shenoy et al. 1999, 2002; Ben Hamed et al. 2003; Bremmer et al. 2010), which is in compliance with our results. Other studies that investigated neural tuning in MSTd during smooth pursuit and planar visual stimulation yielded diverging conclusions. A series of works by Kawano and colleagues suggested that most MSTd neurons fully compensate eye movements and thereby directly encode the velocity of a large-field visual stimulus in head-centered coordinates (Inaba et al. 2007, 2011; Inaba and Kawano 2010). In contrast, Chukoskie and Movshon (2009) found that MSTd neurons exhibit a variety of tuning behaviors ranging from pure retinal to head-centered stimulus velocity coding, confirming our findings.
Lesions in area MSTd severely impair OKR eye movements (Dürsteler and Wurtz 1988; Takemura et al. 2007). By using an optokinetic paradigm in this study, we could assume a participation of the analyzed neurons in oculomotor control. This allowed us to shift our focus from the question ‘which signals are coded?’ to ‘what functions are implemented in this area?’. The coordinate transformation hypothesis offers a straightforward explanation for the diversity in tuning behaviors found in MSTd.
Traditionally, most researchers attempted to correlate neuronal activity with certain variables, assuming direct encoding of sensory or motor signals by different neural populations. This approach may be appropriate for early input or output stages of neuronal processing. However, it poses difficulties when intermediate processing steps of sensorimotor transformation are analyzed. Theoretical work has shown that a neural coding scheme where each object in each reference frame is represented by a different set of neurons will quickly reach limitations due to the combinatorial explosion in the number of required cells (Poggio 1990). It was therefore suggested that a much more efficient scheme for neuronal representation might be used: instead of representing each variable by a certain pool of neurons, one set of basis functions can represent a number of different variables simultaneously. Arbitrary variables are then represented by simple linear combination of these basis functions (Girosi et al. 1995; Pouget and Sejnowski 1997).
Gain fields, as demonstrated by Pouget and Sejnowski (1997), exhibit all characteristics necessary to form a set of basis functions. The diversity of tuning functions we observed in our data is consistent with this theory. Hence, eye velocity gain fields in MSTd could be used to generate a number of other visual motion-related variables, as for instance an estimate of heading direction (Ben Hamed et al. 2003) or perceived self-motion velocity. In our case of planar visual stimulation, perceived self-motion velocity is the stimulus velocity signal directed towards the opposite side. Such inversion can be easily obtained by changing the weights of the connections to the output layer in our neural network model. The self-motion signal might be generalized for head and body motions by the inclusion of vestibular information (Gu et al. 2006, 2007). Our results are therefore compatible with the idea of area MSTd serving various functions in self-motion perception, as well as in oculomotor control.
This work was supported by the German Federal Ministry of Education and Research Grants 01GQ0440 (BCCN), 01EO0901 (IFB), National Institutes of Health Grants EY06069, EY013308, P51 OD010425, and ‘Research to Prevent Blindness’.
We are grateful to Seiji Ono for help in collecting the neurophysiological data. Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.