The coordinated action of the eye and the hand is necessary for the successful performance of a large variety of motor tasks based on visual information. Although at the output level the neural control systems for the eye and the hand are largely segregated, in the parietal cortex of the macaque monkey there exist populations of neurons able to combine ocular and manual signals on the basis of their spatial congruence. An expression of this congruence is the clustering of eye- and hand-related preferred directions of these neurons into a restricted region of the workspace, defined as the field of global tuning. This domain may represent a neural substrate for the early composition of commands for coordinated oculo-manual actions. Here we study two different prototypical network models integrating inputs about retinal target location, eye position and hand position. In the first one, we model the interaction of these different signals, as it occurs at the afferent level, in a feed-forward fashion. In the second model, we assume that recurrent interactions are responsible for their combination. Both models account surprisingly well for the experimentally observed global tuning fields of parietal neurons. When we compare them with the experimental findings, no significant difference emerges between the two. Experiments potentially able to discriminate between these models could be performed.
Coordinated eye–hand movement is common to different forms of visuomotor behavior. Our ability to make combined or independent eye and hand movements, such as when we look at and reach towards an object of interest, or when we reach towards it while looking elsewhere, must reside in a neural process that encodes retinal, eye and hand related inputs in a congruent fashion. In humans, lesions of the parietal lobe (Balint, 1909) disrupt the natural control exerted by vision on movement, and result in directional errors in making purposeful hand reaches (optic ataxia) and eye movements (psychic paralysis of gaze, or gaze apraxia; see De Renzi, 1982) to visual targets. Recently, several studies of the parietal cortex (Ferraina et al., 1997a,b, 2001; Snyder et al., 1997; Batista et al., 1999; Battaglia-Mayer et al., 2000, 2001; Buneo et al., 2002) have highlighted the importance of the integration of eye and hand related information for coordinated eye–hand movements, thus suggesting a link between the disruption of the physiological processes and the parietal deficits (Battaglia-Mayer and Caminiti, 2002). Furthermore, studies in naturalistic settings, such as tea-making (Land et al., 1999), and during natural manipulation of objects (Johansson et al., 2001) show that the eyes do not behave as passive receivers of visual images; rather, their movements are intimately related to the requirements of future manual tasks. Overall, these studies shed new light on the neural basis of eye–hand coordination.
The experimental data we intend to account for consist of the activity of a large population of parieto-occipital neurons (Battaglia-Mayer et al., 2000, 2001) of the monkey (Fig. 1a), studied in a multiple-task approach, mimicking the natural oculo-manual behavior of everyday life. Rhesus monkeys performed six different tasks, requiring different forms of eye–hand coordination and visual stimulation: reach (R), reach-fixation (RF); delayed reach (DR) in light and dark; saccade (S); and visual fixation (VF). A detailed description of these tasks is given in Section 1 of the Appendix.
It has been found that the activity of most cells was modulated by the position and direction of movement of (i) the eye and (ii) the hand, in the absence of eye movement (as in the RF task); and by (iii) coordinated eye and hand movement, both when the target location could not be predicted (as in the R task) and when it was pre-cued (as in the DR task). Neural activity was also modulated in the absence of movement, i.e. (iv) in the delay interval when the animals planned hand reaches, in light and/or in darkness, and (v) during visual stimulation. Thus, these cells fired significantly during different epochs of different behavioral tasks, with cell-to-cell variations that depended on the task. In most parieto-occipital cells, neural activity was directionally tuned (see Appendix, Section 1) during many task epochs. The corresponding preferred directions (PDs) of each neuron clustered within a limited region of the workspace, referred to as the global tuning field (GTF; Fig. 1b), which was characteristic of each cell (Battaglia-Mayer et al., 2000, 2001).
Models of parietal operations (Zipser and Andersen, 1988; Salinas and Abbott, 1996; Pouget and Sejnowski, 1997; Burnod et al., 1999; Deneve et al., 1999; Pouget and Snyder, 2000) have considered properties of parietal cells, such as retinal location of a visual stimulus and eye position in the orbit, to explain how the location of a visual target can be transformed from retinal to body coordinates necessary for purposeful movements. The observation of the GTF of parietal neurons suggests an alternative coding mechanism in parietal cortex, where the representation of directional variables concerning hand and eye movement emerges from Hebbian synaptic plasticity alone. The present study is a first attempt in this direction, and the pros and cons relative to the wealth of models of parietal operations are addressed in the Discussion.
It turns out that a simple Hebbian plasticity scenario does lead to global tuning in line with experimental results. But the GTF is equally well produced by synaptic structuring in the feed-forward system leading to the parietal network as by structuring in the recurrent parietal network. We therefore describe both models and discuss potential ways of distinguishing between them.
The tasks performed by the monkeys allow the definition of several behavioral epochs (see Appendix), involving different forms of eye–hand coupling, some of which are characterized by static holding of the eye and/or of the hand on visual targets. Neural activity during these epochs was interpreted as reflecting eye and hand positional signals, in addition to retinal visual information. The two models tested are a feed-forward (model I; Fig. 2a) and a recurrent one (model II; Fig. 2b), both focusing on three epochs: target holding time (tht) of the R task, where we assume that the input to the network is modulated by eye and hand position (the visual signal remains constant during this epoch); tht of the RF task, where the input depends both on the visual stimulus and on the hand position; and tht of the S task, where it is only the eye position that is relevant. In the experimental paradigm, these three epochs are characterized by the presence of positional signals only. In the experimental design, there were five epochs (Battaglia-Mayer et al., 2001) in which only these signals influenced neural activity. However, two of them (tht of the DR task, in light and dark condition) can be considered behaviorally equivalent to the tht of the R task, since in both the monkey’s eye and hand stayed immobile on the visual target.
In both models presented here, it is assumed that during these three epochs parietal neurons receive an input from other areas encoding eye position (e), visual stimulus position in retinal coordinates (v) and hand position (h). From now on, we will refer to these input signals of different modalities as ‘stimuli’ to the network. The information coming from each of the input areas is collapsed into one variable signaling the direction of the input-stimulus in each modality: an angular variable φq, where q (= e, v, h) distinguishes the different signals. For each modality q, each parietal unit will fire best for a particular value of φq, called the preferred direction ψq. The objective of both models is to account for the natural correlations observed between the preferred directions of different modalities, in terms of cross-modality Hebbian interactions generated by experience. For example, during natural reaching movements, the eye and the hand, although at different times, move together in a certain direction toward a common target. This is a cause of cross-modality structuring.
This study confirms that Hebbian structuring in situations of multi-modality input leads to global tuning, in both models. It then goes on to compare the performance of the two models and suggests experimental criteria for distinguishing between them. Both models are consistent with the general framework of the operations underlying cortical control of reaching, as in (Burnod et al., 1992, 1999).
General Structure of the Models
First we describe the common structure of the two models. Specifics for each model will be given afterwards (for technical details see Section 2 of the Appendix).
Both the feed-forward (model I; Fig. 2a) and the recurrent (model II; Fig. 2b) models share some basic properties. The first is that, prior to the introduction of interactions among different modalities, each parietal unit i has three random and independent preferred directions, one for each modality q = e, v, h (Fig. 2a,b). This could be due to anatomical input differences from neuron to neuron, and/or to the initial random distribution of the synaptic efficacies between the areas involved.
Secondly, both models involve a basic structure, in which each of the three modalities provides inputs to the parietal unit that are independent of its preferred direction for the other modalities. In other words, this structure (which may be innate) can be regarded as defining the tuning properties of the units. The two learning scenarios (feed-forward and recurrent) depicted below add cross-modality interactions in two different ways, building on this basic structure.
Thirdly, in both models the total external input to the parietal unit i is the sum of three unimodal and independent components (Fig. 2a,b; equation A3), representing eye position (e), visual stimulus position in retinal coordinates (v) and hand position (h). The current arising from each input module is the sum of the activities over a large number of units (j in Fig. 2a and equation A4). The activities in the input module, in response to a stimulus at angle φq, are modeled by a Gaussian function (equation A5) peaked at a given angle, which represents the preferred direction of unit j (q = e, v, h). The units in each input module have preferred directions evenly distributed on the circle. The relationship of parietal activity to eye position was described by a Gaussian function, instead of the more common linear (Zipser and Andersen, 1988) or sigmoid (Salinas and Abbott, 1996; Pouget and Sejnowski, 1997) curves. This is justified by the fact that the present models operate in polar coordinates, at fixed eccentricity of stimuli, to mimic the experiments analyzed.
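The input stage described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: one input module is represented by units with evenly spaced preferred directions and Gaussian tuning on the circle. The tuning width `sigma_deg`, the number of units and the function names are assumptions made for the example; the actual functional form and parameters are those of equation A5.

```python
import numpy as np

def circ_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def input_activity(phi_deg, unit_pds_deg, sigma_deg=30.0):
    """Gaussian response of each input unit to a stimulus at angle phi_deg.

    sigma_deg is an assumed tuning width, not a value taken from the paper.
    """
    d = circ_diff_deg(phi_deg, unit_pds_deg)
    return np.exp(-d ** 2 / (2.0 * sigma_deg ** 2))

# Preferred directions evenly distributed on the circle (1 degree spacing).
unit_pds_deg = np.arange(-180.0, 180.0, 1.0)

# Population response of one module (for example q = h) to a stimulus at 90 degrees.
rates = input_activity(90.0, unit_pds_deg)
most_active_pd = unit_pds_deg[np.argmax(rates)]
```

The unit whose preferred direction coincides with the stimulus angle is the most active one, and activity falls off symmetrically on either side of it.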
The contribution of each single afferent unit to the total current is weighted by a feed-forward synaptic matrix, which connects module q to the parietal network (Fig. 2a; equation A6). At this stage, the synaptic matrix has no cross-modality interactions: the strength of a connection depends only on the PD of the input unit and on that of the parietal unit for the same modality. Synapses between input units and parietal units are potentiated maximally when their preferred directions in a given modality q are nearby, with their strength falling off with increasing angular difference. Their distribution goes through an inhibitory part before tending to zero, expressing an overall inhibition. Thus, the synaptic matrix is modeled by a ‘Mexican hat’ function (Fig. 3; equation A6). This implies, for example, that if a generic input unit j encoding eye position has a PD similar to the one associated with modality e of the parietal unit i, the synapse between i and j is potentiated. In contrast, if the same input unit j has its PD similar only to a parietal PD of a different modality, such as hand position, their interaction will be negligible. This will result in a current from module q that will be maximal for a stimulus located at the preferred direction of the parietal unit for modality q. Such a matrix may be considered as built-in, defining the unimodal PDs in the parietal network.
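To illustrate how such a connectivity profile fixes the unimodal PD, the sketch below uses a difference of two Gaussians as a stand-in for the ‘Mexican hat’ of equation A6. The kernel shape, the widths `sigma_exc` and `sigma_inh`, the inhibition weight `k_inh` and all function names are assumptions of this example, not the published parameters.

```python
import numpy as np

def circ_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def mexican_hat(delta_deg, sigma_exc=20.0, sigma_inh=50.0, k_inh=0.5):
    """Assumed kernel: narrow excitation minus broader, weaker inhibition."""
    exc = np.exp(-delta_deg ** 2 / (2.0 * sigma_exc ** 2))
    inh = k_inh * np.exp(-delta_deg ** 2 / (2.0 * sigma_inh ** 2))
    return exc - inh

def unimodal_current(phi_deg, psi_deg, unit_pds_deg, sigma_in=30.0):
    """Current into a parietal unit with PD psi_deg for a stimulus at phi_deg."""
    rates = np.exp(-circ_diff_deg(phi_deg, unit_pds_deg) ** 2 / (2.0 * sigma_in ** 2))
    weights = mexican_hat(circ_diff_deg(psi_deg, unit_pds_deg))
    return float(np.dot(weights, rates))

unit_pds_deg = np.arange(-180.0, 180.0, 1.0)   # input units of one module
stim_deg = np.arange(-180.0, 180.0, 1.0)       # tested stimulus angles
psi_deg = -90.0                                # hypothetical parietal PD

tuning = np.array([unimodal_current(p, psi_deg, unit_pds_deg) for p in stim_deg])
peak_deg = stim_deg[np.argmax(tuning)]
```

With a symmetric kernel of this kind, the current from the module peaks when the stimulus lies at the parietal unit's own preferred direction for that modality, which is the property the built-in matrix is meant to provide.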
The Feed-forward Model (Model I)
In this model, the interactions among modalities (cross-modality structuring) were introduced in the feed-forward synaptic matrix mentioned above. The basic underpinning of this structuring can be explained by two considerations: (i) when a stimulus of a given modality is presented to the network, those neurons with a preferred direction close to that of the stimulus fire at enhanced rates, both in the input area and in the parietal network; and (ii) both in life and in the experiments described above, stimuli from different modalities tend to be correlated, and therefore input signals of different modalities with equal PDs are likely to occur concurrently. This will induce synaptic modifications that tend to potentiate synapses between neurons in an input area and neurons of the parietal network that share similar preferred directions, even if these belong to different modalities. The cross-modality terms introduced in the matrix would be the result of Hebbian learning in the synaptic matrix connecting each of the input areas to the parietal network.
In the new synaptic matrix (equation A8), the cross-modality interactions may have a different strength from the unimodal ones. This is described by a parameter α that expresses the strength of the cross-modality contribution relative to the intra-modality efficacy; α assumes values between 0 and 1. For α = 0, the structuring reduces to the basic structure described above. For α = 1 the input to the parietal unit i from a modality q depends equally on all three preferred angles of the unit, and is identical for a given stimulus angle, regardless of its modality (equation A9).
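The role of α can be made concrete with a small numerical sketch. It assumes that the weights from module q to a parietal unit are the unimodal kernel centered on the unit's PD for q, plus α times the same kernel centered on its PDs for the other two modalities. This form, the difference-of-Gaussians kernel, the widths and the example PDs are assumptions chosen to match the verbal description; equation A8 gives the form actually used.

```python
import numpy as np

def circ_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def mexican_hat(delta_deg, sigma_exc=20.0, sigma_inh=50.0, k_inh=0.5):
    """Assumed kernel: narrow excitation minus broader, weaker inhibition."""
    exc = np.exp(-delta_deg ** 2 / (2.0 * sigma_exc ** 2))
    inh = k_inh * np.exp(-delta_deg ** 2 / (2.0 * sigma_inh ** 2))
    return exc - inh

GRID = np.arange(-180.0, 180.0, 1.0)  # input-unit PDs and tested stimulus angles

def new_preferred_directions(psi_deg, alpha, sigma_in=30.0):
    """Peaks of the three unimodal responses of one parietal unit.

    psi_deg holds the initial PDs for (e, v, h). Assumed weights for modality q:
        J_q = M(psi_q) + alpha * sum over q' != q of M(psi_q'),
    computed below in the equivalent form (1 - alpha) * M_q + alpha * sum of all M.
    """
    psi = np.asarray(psi_deg, dtype=float)
    # Input rates: rows index the stimulus angle, columns the input unit.
    rates = np.exp(-circ_diff_deg(GRID[:, None], GRID[None, :]) ** 2
                   / (2.0 * sigma_in ** 2))
    hats = mexican_hat(circ_diff_deg(psi[:, None], GRID[None, :]))  # (3, n_units)
    total = hats.sum(axis=0)
    pds = []
    for q in range(3):
        weights = (1.0 - alpha) * hats[q] + alpha * total
        response = rates @ weights  # linear transfer: response equals current
        pds.append(GRID[np.argmax(response)])
    return np.array(pds)

psi0 = [20.0, -90.0, 50.0]  # hypothetical initial PDs for (e, v, h)
pds_alpha0 = new_preferred_directions(psi0, alpha=0.0)
pds_alpha1 = new_preferred_directions(psi0, alpha=1.0)
```

Under these assumptions the two limiting cases behave as described in the text: at α = 0 each response peaks at the unit's initial PD for that modality, and at α = 1 the three responses coincide, so the three peaks are identical.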
The output rate νi of unit i of the parietal network will be a linear function of the total input, i.e. the sum of the currents of the three unimodal components. The precise form of the transfer function is not relevant to our purposes, since we are only interested in the maximal value of the output. Thus, in what follows it will be assumed that the response of unit i coincides with the afferent current (equation A10).
The presence of the cross-modality contributions in the new synaptic matrix elicits afferent unimodal currents that are not necessarily peaked at the original PDs of a parietal unit. This will hold also for the corresponding response, as implied by the choice made for the transfer function. Therefore, three new preferred directions emerge as the actual peaks of the response of the parietal unit to the input from each separate modality and, in general, they will be different from the initial ones. Examples of such shifts are shown in Figure 4, where the response of a parietal unit, due to a purely visual stimulus, is shown for different values of α. The location of the peak of each curve corresponds to the new PD. The shift of the PD with increasing values of α is remarkable: from –90° (at α = 0) to 0° (at α = 1). Furthermore, for increasing values of α the three new PDs tend to cluster. This has been quantified by the maximum angular dispersion between the three peak angles. Sharper clustering implies smaller dispersion of preferred directions. Figure 5a shows the average dispersion across 1000 units of the parietal network as a function of α: for α approaching 1, the three preferred angles become identical (as shown in Appendix, equation A9).
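One way to compute a dispersion measure of this kind is sketched below. It interprets the maximum angular dispersion as the largest pairwise angular distance on the circle among the three PDs; this reading and the function names are assumptions of the example, since the exact definition is not spelled out in this section.

```python
import numpy as np

def circ_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def max_angular_dispersion(pds_deg):
    """Largest pairwise circular distance (degrees) among a unit's PDs."""
    pds = np.asarray(pds_deg, dtype=float)
    pairwise = np.abs(circ_diff_deg(pds[:, None], pds[None, :]))
    return float(pairwise.max())

# Three PDs straddling the +/-180 degree cut are still recognized as clustered.
tight = max_angular_dispersion([170.0, -170.0, 180.0])
# Three PDs evenly spread on the circle give a large dispersion.
spread = max_angular_dispersion([0.0, 120.0, -120.0])
```

Wrapping the differences before taking the maximum matters here, because a naive subtraction would treat 170° and –170° as 340° apart rather than 20°.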
Since the input consists only of a linear combination of the input-modalities, there is no gain effect (Andersen et al., 1985), but only an offset of the response to a given modality when another modality is modulated. A nonlinear transfer function could generate gain effects even with linear inputs. However, such a nonlinear transfer function would not affect the clustering of the preferred directions of the units, since the stimulus that produces the maximum input will also generate the maximum emission rate. This aspect will not be pursued here.
The Recurrent Parietal Network (Model II)
In the second model, the cross-modality interactions are introduced through recurrent connections between the parietal units (i, k in Fig. 2b). In this case, the total input current to a unit will be the sum of two components (equation A11):
(i) an external input dependent on the stimuli in the three modalities as described in the basic structure. No cross-modality interactions are present in this term, as opposed to the feed-forward model;
(ii) a recurrent input from the other units of the parietal layer, consisting of their output responses weighted by a new, recurrent synaptic matrix.
The current-to-rate transduction function of the parietal units is a standard threshold-linear function of the total input. Dynamics are introduced to allow the parietal units to interact with each other and to reach a stationary state (equation A12). The responses analyzed here are the asymptotic stationary rates.
Hebbian cross-modality structuring occurs only in the recurrent synaptic matrix. The correlations among modalities mentioned above, present in the learning scenario, naturally enhance the synapse between units that have similar preferred directions, even for different modalities (equation A13). In the recurrent synaptic matrix, this cross-modality structuring is implemented with terms that reflect the potentiation (depression) of the efficacies between units with nearby (distant) preferred directions for different modalities. The terms involving two different modalities may have a different efficacy relative to the intra-modality ones, expressed again by the parameter α (0 < α < 1). In this model the pre-assigned preferred directions of a unit will be changed by the recurrent cross-modality interactions. The new preferred directions are obtained as the peaks of the asymptotic response to a stimulation in each single modality.
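The recurrent scheme can be sketched numerically as follows. Everything specific in this example is an assumption made for illustration: the difference-of-Gaussians kernel, the way intra- and cross-modality terms are summed, the normalization of the matrix, the simplified Gaussian external input, the zero threshold and the Euler integration. Equations A11 to A13 define the model actually used.

```python
import numpy as np

def circ_diff_deg(a, b):
    """Signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0

def mexican_hat(delta_deg, sigma_exc=20.0, sigma_inh=50.0, k_inh=0.5):
    """Assumed kernel: narrow excitation minus broader, weaker inhibition."""
    exc = np.exp(-delta_deg ** 2 / (2.0 * sigma_exc ** 2))
    inh = k_inh * np.exp(-delta_deg ** 2 / (2.0 * sigma_inh ** 2))
    return exc - inh

def recurrent_matrix(psi_deg, alpha):
    """psi_deg: array (N, 3) of pre-assigned PDs (e, v, h) of N parietal units."""
    n = psi_deg.shape[0]
    w = np.zeros((n, n))
    for q in range(3):
        for p in range(3):
            strength = 1.0 if q == p else alpha  # cross-modality terms scaled by alpha
            d = circ_diff_deg(psi_deg[:, q][:, None], psi_deg[:, p][None, :])
            w += strength * mexican_hat(d)
    # Assumed normalization: largest absolute row sum equals 0.9, which
    # guarantees that the threshold-linear dynamics below converge.
    return 0.9 * w / np.abs(w).sum(axis=1).max()

def stationary_rates(ext_input, w, dt_over_tau=0.5, tol=1e-10, max_steps=5000):
    """Euler integration of tau * dnu/dt = -nu + [ext_input + w @ nu]_+ ."""
    nu = np.zeros_like(ext_input)
    for _ in range(max_steps):
        target = np.maximum(ext_input + w @ nu, 0.0)  # threshold-linear transfer
        step = dt_over_tau * (target - nu)
        nu = nu + step
        if np.max(np.abs(step)) < tol:
            break
    return nu

rng = np.random.default_rng(0)
psi = rng.uniform(-180.0, 180.0, size=(200, 3))  # random, independent PDs
w = recurrent_matrix(psi, alpha=0.5)

stim = np.array([0.0, 0.0, 0.0])  # stimulus angles for (e, v, h)
# Simplified external input: sum of three unimodal Gaussian bumps.
ext = np.exp(-circ_diff_deg(stim[None, :], psi) ** 2 / (2.0 * 30.0 ** 2)).sum(axis=1)

nu = stationary_rates(ext, w)
residual = np.max(np.abs(np.maximum(ext + w @ nu, 0.0) - nu))
```

Tuning curves for one modality would then be obtained by repeating the last three steps while varying one entry of `stim` and reading off the angle at which each unit's stationary rate is largest.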
The maximum angular dispersion is again taken to measure the amount of clustering of the preferred directions in the three modalities. The three new preferred directions are computed by varying the direction of the stimulus in each modality, while keeping the other two fixed at 0°, as in model I. In Figure 5b the dispersion is shown as a function of α. For small values of α, the amount of clustering is negligible, increasing with α. For α = 1 the clustering (Fig. 5b) is not as complete as in the feed-forward case (Fig. 5a), and some scatter persists, with an average maximum dispersion of ∼40°. In the case α = 0 this model becomes very similar to that of Salinas and Abbott (1996), and is able to reproduce gain fields. This remains true also for α ≠ 0, where gain fields accompany the clustering of the preferred directions. It is worth noting that when inhibition is removed from the network, by setting to zero the negative synaptic efficacies, cross-modality structuring does not produce clustering: in this case the network’s output resembles, for any value of α, that observed in the case α = 0.
Results and Comparison with the Experimental Data
As detailed in the Appendix, the experimental data (Battaglia-Mayer et al., 2000, 2001) were obtained by recording parietal cell activity through a multitask approach, where 21 behavioral epochs were identified. For each cell, the directional tuning properties in all epochs were investigated. Each neuron had a preferred direction for each of the behavioral epochs in which neural activity displayed significant directional tuning. For those cells with at least three preferred directions, the corresponding circular distribution has been analyzed by the Rayleigh test of uniformity (Batschelet, 1981). When the distribution of the preferred directions was unimodal, they clustered in a limited sector of the circle, the GTF.
Simulations are performed to mimic the experimental procedure. The network is run repeatedly over all possible input signals, represented by varying the eight values of the angle (φ) across the three task conditions. For each unit in the network, the model-generated response rate is computed, and Poissonian noise is added to account in part for the rate variability of biological neurons. Then, all the responses of a unit to the same stimulus are averaged across repetitions. This is done for both models and for several values of the parameter α. Unit responses are then fitted with a cosine function, to determine if they are unimodal (R² > 0.5) and, if so, their preferred directions. Only units that fit this criterion are further considered.
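The analysis step of this procedure can be sketched as below. The example assumes that the noisy responses are Poisson counts drawn around the model rate and that the cosine fit is an ordinary least-squares regression on cos φ and sin φ; the number of repetitions, the example tuning curve and the function name are likewise hypothetical.

```python
import numpy as np

def cosine_fit(angles_deg, rates):
    """Least-squares fit of rates = b0 + b1*cos(phi) + b2*sin(phi).

    Returns the preferred direction in degrees and the R^2 of the fit.
    """
    phi = np.deg2rad(np.asarray(angles_deg, dtype=float))
    rates = np.asarray(rates, dtype=float)
    design = np.column_stack([np.ones_like(phi), np.cos(phi), np.sin(phi)])
    coef, *_ = np.linalg.lstsq(design, rates, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((rates - fitted) ** 2)
    ss_tot = np.sum((rates - rates.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    pd_deg = np.rad2deg(np.arctan2(coef[2], coef[1]))
    return pd_deg, r2

angles = np.arange(0.0, 360.0, 45.0)  # eight stimulus directions
# Hypothetical model unit: cosine tuning with its peak at 45 degrees.
true_rate = 20.0 + 10.0 * np.cos(np.deg2rad(angles - 45.0))

rng = np.random.default_rng(1)
trials = rng.poisson(true_rate, size=(50, angles.size))  # 50 noisy repetitions
mean_rate = trials.mean(axis=0)                          # average across repetitions

pd_deg, r2 = cosine_fit(angles, mean_rate)
is_tuned = r2 > 0.5  # criterion for keeping the unit
```

The preferred direction is recovered as the phase of the fitted cosine, so it is not restricted to the eight tested directions.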
The two models have been successful in replicating the clustering of preferred directions in the three tasks considered, as seen in the experimental data. Figure 6 (left and right panels) shows the output of a network unit of the feed-forward and of the recurrent model, respectively, to three inputs reflecting the signals that come into play during the three task epochs (tht R, tht RF, tht S), as defined above.
Both models naturally reproduce output–tuning curves with a main preferred direction (Fig. 6, middle), and align them (Fig. 6, bottom), as observed in the experimental data (Fig. 7). It is seen that the model networks reproduce the features of the GTF of parietal neurons, only upon cross-modality structuring.
To evaluate the quality of the global tuning, i.e. the amount of clustering of preferred directions, a test similar to the Rayleigh test of randomness was employed: each of the PDs of every unit was represented by a unit vector; the modulus of the average of these three unit vectors, li, provided a measure of the clustering. A network unit was considered as expressing a GTF when li > 0.5. This test was motivated by the non-stability of the Rayleigh test with such a small number (three) of vectors, and by the fact that the level of 0.5 produced results similar to those obtained experimentally with many more vectors (see Battaglia-Mayer et al., 2000, 2001).
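The clustering measure just described can be written in a few lines. The sketch below follows the definition in the text (modulus of the average of the unit vectors, threshold 0.5); the function names and the example angles are hypothetical.

```python
import numpy as np

def resultant_length(pds_deg):
    """Modulus of the mean of the unit vectors pointing along each PD."""
    phi = np.deg2rad(np.asarray(pds_deg, dtype=float))
    return float(np.hypot(np.cos(phi).mean(), np.sin(phi).mean()))

def has_gtf(pds_deg, threshold=0.5):
    """A unit is taken to express a GTF when the resultant length exceeds 0.5."""
    return resultant_length(pds_deg) > threshold

clustered = resultant_length([0.0, 30.0, 60.0])    # PDs within a 60 degree sector
scattered = resultant_length([0.0, 120.0, 240.0])  # PDs evenly spread on the circle
```

For the first example the measure equals (1 + 2 cos 30°)/3, about 0.91, so the unit passes the test; for the evenly spread PDs the three unit vectors cancel and the measure is essentially zero.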
The same test has been applied to the units of models I and II, and to the experimental data (Battaglia-Mayer et al., 2000, 2001). In the data, 49/80 (61%) cells passed the global tuning test. The value of α has been set so as to reproduce this percentage in both models, starting from a sample of 1000 units in each of them. Model I reproduced the desired percentage with α = 0.36, and model II with α = 0.53. However, for these values of the parameter both models showed a distribution of the dispersion more skewed towards small dispersions than the experimental data. This difference between both models and the experimental data could be due to different factors, such as a high level of biological noise, the existence of input signals other than the three considered here, and deviations from the strict functional relationships that characterize the input and the synaptic matrix in the models.
The main issue confronted here is a search for a mechanism that might align the effective preferred directions of cells, which receive independent input from several different (and independent) modalities. Such ‘global tuning’ would presumably have to take place against a background in which the afferent inputs from the different modalities are independent, and so are the a priori preferred directions of the ‘posterior parietal’ cells. We establish the theoretical possibility that simple Hebbian synaptic plasticity, expressing the effects of coordinated stimulation from several modalities, would lead to the alignment in the upstream module. That is, the Hebbian plasticity causes afferents from different sources to converge. Once this basic fact is established, it turns out that the mechanism is not unique — it can take place either in a feed-forward processing from the unimodal zones to the parietal module, or within the parietal module, by recurrent processing. The particularity of this situation is underlined by contrast to processing in V1, for example, where afferents from the same input area are successfully segregated into orientation columns. What is similar is the possibility that the observed phenomenon be generated by feed-forward interactions or by recurrent ones for both of which experimental evidence has been provided (for a review, see Ferster and Miller, 2000). However, it is worth stressing that for these tasks the operations in parietal and primary visual cortex are of different nature. In V1 the main computation is the identification of different features from a unique input coming from the lateral geniculate nucleus, while in the parietal cortex it could be the computation of a unique signal from inputs of different origin and nature. 
It is interesting that the same cortical mechanism seems to operate across different tasks, implying that studies and models concerning a given area of the cortex may be generalized to many others, in spite of their anatomical and functional differences. In any case, at the level of stationary neural activities the two models we have tested cannot be distinguished, and devising a discriminating experiment may be as difficult as in V1. But the main result remains: the Hebbian mechanism is quite natural in generating global tuning.
Both models capture a great deal of the observed data. This raises questions concerning the biological plausibility of the models, and their anatomical and physiological underpinning. The models use positionally-tuned visual, eye, and hand signals as independent inputs to the network. There is strong evidence from anatomical studies (Colby et al., 1988; Blatt et al., 1990; Johnson et al., 1996; Caminiti et al., 1996; Shipp et al., 1998; Matelli et al., 1998; Marconi et al., 2001) that such information can be conveyed to the parieto-occipital cortex by cortico-cortical association pathways originating from different independent sources. Moreover, it is well known from physiological studies (Galletti et al., 1995; Johnson et al., 1996; Battaglia-Mayer et al., 2000, 2001) that this information is very often positional in nature. The choice of representing the input as three separate independent signals appears justified in biological terms. Model I postulates Hebbian synaptic plasticity on the connections from the afferent input. Model II uses a recurrent network architecture. The synaptic connections have in both cases a structure interpretable in terms of Hebbian pre-learning, a feature believed common to brain ‘engram’ encoding. In other words, we do not implement the synaptic plasticity dynamics, but use the Hebbian logic to construct the final synaptic matrices (see also below). Most important, the networks generate an output that is in many respects similar to the GTF of parieto-occipital neurons, only when the cross-modal structuring is adopted. It suggests that the GTF may be an emergent property of the combination of independent inputs of different modalities, within the afferent or the recurrent (feed-back) connections of parieto-occipital cortex. This combinatorial power has been documented by neurophysiological studies (Battaglia-Mayer et al., 2000, 2001), and might have an anatomical substrate in the long and local cortico-cortical connections. 
If so, the outcome of this combinatorial process can be addressed to areas linked to parietal cortex by reciprocal association connections, such as the dorsal premotor areas of the frontal lobe (Johnson et al., 1996; Matelli et al., 1998; Marconi et al., 2001), where it can be studied to assess the evolution of the information processing flow occurring at the transition from vision to movement in the parieto-frontal network.
Naturally, the Hebbian plasticity invoked would mix the effects described for the two models. It would create, as well, selective connections between the parietal and other areas of the cortex besides the recurrent ones. Although we have not explored this scenario directly, its properties would be qualitatively similar to those of the two models considered here. The quantitative features of a mixed model should be studied in detail.
Model Predictions and Experiments to Distinguish between Scenarios
As already mentioned, the stationary data we analyze do not allow us to distinguish between the two implementations of the modeling scheme. Hence we turn to some other characteristics of data that may be discriminating and hopefully accessible.
The recurrent network requires time to relax into the stationary state, when the GTF emerges. By analyzing the dynamical responses to different input signals, it may be possible to identify which part of the GTF is due to the direct input and which part is due to the recurrent synapses (Pack and Born, 2001).
Model I requires that a significant fraction of single units receive direct afferents from all three input modalities. Model II involves no such requirement. This may or may not be verified on anatomical and physiological grounds, thus providing a test to discriminate the two models.
Model II attributes a central role to inhibition in cortical processing: it predicts that in an experiment in which inhibition is locally suppressed in parietal cortex, the GTF of individual neurons will not emerge. This would be a critical experiment to distinguish between the performances of the two models, since that of model I does not critically depend on inhibition.
Finally, in the case of a mixed model these three experiments would help to understand the relative strength and efficacy of the two components, the feed-forward and the recurrent one.
Comparison with other Models
Both models presented here intend to account for the clustering mechanism from which the GTF of superior parietal neurons emerges, and to provide a computational framework for coordinate transformations in the parieto-occipital cortex. It has long been proposed that parietal neurons are involved in coordinate transformations between different areas and frames, and most available models espouse this view. Here we have adopted a different approach, and considered the parieto-occipital cortex to be a region of multimodal convergence where the computation (e.g. planning combined eye–hand movements) is implicit, since it allows for inputs of different modalities to converge on the same neuron if they represent neighboring objects in the world (in the experiment modeled, angles on the circle, or directions in the workspace).
At this stage, the approach presented here does not provide for the ‘inverse transformations’ required for generating motor plans on the basis of the computation performed in the parietal cortex. Yet the Hebbian approach adopted envisages rather naturally the formation of a synaptic structure between neurons of parietal and premotor areas of the cerebral cortex. This synaptic structure could relate the parietal representation of incoming signals of different modalities to the motor assemblies with analog tuning properties in any one of the relevant modalities. Furthermore, the Hebbian learning process required for these synapses could use the natural correlations between activities in various modalities. This process has not been implemented dynamically in this study, alongside the neural activities present in the system. This aspect is reserved for future work.
There are some main differences between the present approach and other models.
Our model II takes off from the model of Salinas and Abbott (1996) on gain fields, in which the gain effects arise naturally from recurrent interactions. Hence, model II automatically subsumes this effect, and expresses it in a pure form when α = 0, separately for each modality. Our model goes one step further in that cross-modality interactions together with recurrence produce the alignment of preferred directions we were after in this study.
For model I gain fields are not quite natural. The model performs linear summation on the inputs, and there are no recurrent interactions, thus the nonlinearity associated with gain fields must originate in nonlinearities of the output function of parietal neurons. This distinction may represent a point in favor of the recurrent model.
The approach adopted by Andersen and his colleagues (Zipser and Andersen, 1988; Xing and Andersen, 2000) proposes that posterior parietal cortex behaves as a hidden layer performing direct transformations from two or more external areas, using different reference frames (eye-centered, head-centered, body-centered, etc.; for a review see Cohen and Andersen, 2002). In these models, the hidden units produce gain fields and receptive field shifts quite similar to those measured experimentally. These models use non-local training methods and would fail to reproduce the clustering of preferred directions that leads to the GTF of superior parietal neurons, since their present learning protocol does not use correlation between modalities.
In the models of our study, the focus has been on the GTF. Note, however, that clustering implies also a certain degree of receptive field shift: when inputs of different modalities occur together, the resulting preferred direction is intermediate between those due to a unimodal input. We have not studied quantitatively whether receptive field shifts are accounted for by our model.
The approach originally proposed by Pouget and Sejnowski (1997) postulates that parietal cortex is a general system for transforming coordinates between different reference frames, and that these spatial transformations occur using basis functions. In fact, the tuning of a parietal neuron is regarded as a basis function of the product space of the different modalities. This approach has been extended (Deneve et al., 1999; Pouget et al., 2002) to include attractor dynamics, in order to allow for statistical inference among inputs. In this model, the recurrent connections are made not between parietal neurons, but rather between them and the afferent areas that are the source of inputs of different modalities. This recurrent network is also capable of reproducing the RF shifts observed experimentally (Jay and Sparks, 1987; Stricanne et al., 1996; Duhamel et al., 1997).
The tuning curve representation underlying this very imaginative model appears in contrast with the alignment of preferred directions, which characterizes most superior parietal neurons. Such alignment would lead to an incomplete basis for the representation. Yet, this apparent contrast may be reconciled by the statistics of alignment. If this were the case, this model would not contrast with the clustering of preferred directions observed in the GTF, but would not account for it.
1. Neurophysiological experiments and data
Monkeys performed six different tasks. Arm movements originated from a central position and were made toward eight peripheral targets (subtending 1.5° in visual angle) located on a circle of 7.5 cm radius (23.8° visual angle).
Reach Task
A red center light was first presented and the animal fixated and touched it with the hand for a variable control time (CT, 1–1.5 s). Then, one of the eight red peripheral targets was lit, in a randomized block design. Within given reaction- and movement times (RT, 0.5 s, upper limit; MT, 1 s upper limit), the animal moved the eyes and then the hand to the target and was required to keep them there for a variable target holding time (THT, 1–1.5 s), before receiving a liquid reward. In this and in all reaching tasks, RT and MT are defined relative to the hand behavior.
Reach-Fixation Task
The monkey fixated the fixation point (consisting of two yellow vertical bars of 0.4° side, divided by a narrow black gap) located at the center of the screen and touched a red center light for a variable CT. One of the eight red peripheral targets was then lit and the center light extinguished, while the fixation point remained on. The animal had to maintain the central fixation, move the hand to the target and keep it there for a variable THT (1–1.5 s), until a 90° rotation of the fixation point occurred.
Delayed Reach Task
In this task, the preparation and execution of eye movement (D1, D2) was separated in time from execution of hand movement (RT, MT). The animals fixated and touched a red center stimulus for a variable control time. One of eight green targets was then lit, as the instruction signal (IS) for the next intended arm movement. After a variable reaction time (D1) and movement time (D2), the eye achieved the green target and stayed there for the remainder (D3) of the instructed-delay time (1–2.5 s), and during the upcoming hand movement and static holding on the target. During the entire instructed delay time (D1 + D2 + D3) the animal was required to withhold the hand movement until the green IS was turned red. This was the go-signal for the hand to reach toward the target. Within given RT and MT, the hand achieved the target and stayed there for a variable THT (1–1.5 s). The monkeys performed the task under both normal light (l) conditions and in darkness (d).
Saccade Task
Monkeys made saccades from a common origin to eight targets in the same locations as those used in the arm reaching tasks. A fixation point was presented at the center of the workspace. During the CT, the animal pressed a key and kept central fixation. One of eight peripheral targets was then presented, in a randomized block design, and the fixation point extinguished. Within given eye RT and MT, the animals made a saccade to the target and were required to keep their eyes immobile there for a variable eye THT (1–1.5 s). The target was then extinguished and the animal had to release the key, to receive a liquid reward.
Visual Fixation Task
A fixation point was presented at the center of the workspace. During the control time the monkeys fixated the fixation point and kept the key down. A visual stimulus was then moved in one of 16 directions (22.5° angular intervals) from the periphery of the visual field inward toward the fovea and outward from the fovea to the periphery. At the end of the fixation time, the visual stimulus was extinguished and the animal had to detect a 90° rotation of the fixation point, by releasing the telegraph key. Stimuli consisted of white solid bars (3.27° × 7.60°) or of bars of static or dynamic random dots and were moved at constant speed (25°/s) during attentive fixation. Visual stimuli were presented up to 30° eccentricity.
Analysis of directional relationships
In our two-dimensional experimental set-up, the angular variable of interest is the location of the target, uniquely determined by the angle α, varying from 0 to 360°. A cosine tuning function with adjustable width (Amirikian and Georgopoulos, 2000) was used to describe the relationship between cell activity and direction of movement, instead of the standard cosine function (Georgopoulos et al., 1982)
$$y_1(\alpha) = A + K\cos(\alpha - C) \qquad \text{(A1)}$$
where y1(α) is the frequency of neuronal discharge and A, K and C are the regression coefficients characteristic for each neuron.
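The new function adopted, which accounts for variations in the breadth of the fitting curve, is defined as
$$y_2(\alpha) = A + K\cos\chi(\alpha), \qquad \chi(\alpha) = \begin{cases} S\,\Psi(\alpha), & S\,\Psi(\alpha) < \pi \\ \pi, & \text{otherwise} \end{cases} \qquad \text{(A2)}$$
where the transformation
$$\Psi(\alpha) = \arccos[\cos(\alpha - C)]$$
has been performed to guarantee that the function is periodic, with a period of 2π.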
In this model y2(α) is the frequency of neuronal discharge and A, K, C and S are regression coefficients determined by a least-squares method. The three parameters A, K and C play the same role as in the standard cosine function (C still representing the preferred direction, PD), while S is the additional parameter that controls the tuning width. This parameter defines the angular interval in which y2(α) is cosine modulated. Thus, for the particular case S = 1, (A2) reduces to (A1), while S < 1 corresponds to a broader function and S > 1 to a sharper one, relative to the simple cos(α). For the goodness of fit, a coefficient of determination R² ≥ 0.7 was used as the threshold to assess whether or not neural activity was directionally tuned.
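The role of the width parameter S can be illustrated with a short Python sketch. It assumes a clipped form of the adjustable-width cosine (cosine modulation within an angular interval set by S, constant at the minimum outside it); the function names, the synthetic firing rates and the parameter values are hypothetical and are not taken from the recordings.

```python
import math

def angular_distance(alpha, c):
    """Absolute angular distance (radians, in [0, pi]) between alpha and c."""
    return math.acos(max(-1.0, min(1.0, math.cos(alpha - c))))

def standard_cosine(alpha, A, K, C):
    """Standard cosine tuning function."""
    return A + K * math.cos(alpha - C)

def adjustable_cosine(alpha, A, K, C, S):
    """Cosine tuning with width parameter S (assumed clipped form)."""
    x = S * angular_distance(alpha, C)
    return A + K * math.cos(min(x, math.pi))

def r_squared(observed, predicted):
    """Coefficient of determination of a fit."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot

# Eight target directions, 45 degrees apart.
directions = [math.radians(45 * k) for k in range(8)]
pd = math.radians(90)

# Synthetic, noise-free rates of a hypothetical sharply tuned cell (S = 2).
rates = [adjustable_cosine(a, 20.0, 15.0, pd, 2.0) for a in directions]

# Compare a fit with the correct width against a standard cosine (S = 1).
fit_sharp = [adjustable_cosine(a, 20.0, 15.0, pd, 2.0) for a in directions]
fit_plain = [standard_cosine(a, 20.0, 15.0, pd) for a in directions]
is_tuned = r_squared(rates, fit_sharp) >= 0.7
```

In practice A, K, C and S would be estimated by a least-squares routine; the sketch only evaluates the functions and the R² criterion for given parameters.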
Since a fitting curve was calculated from the average firing frequency during all the epochs of the Reach, Reach-fixation, Delayed reach (l – d) and Saccade tasks, several PDs were obtained for each neuron, with a maximum of 21 PDs for each cell. The actual total number of PDs depended on the epochs in which the cell was directionally tuned (R² ≥ 0.7). To assess whether the distribution of the preferred directions of each neuron was unimodal, the Rayleigh test of randomness (P < 0.05) was performed. For those cells that showed a unimodal distribution, the angular deviations were calculated as a measure of dispersion (Batschelet, 1981) of their preferred directions.
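A minimal Python sketch of these circular statistics is given below. It assumes a commonly used closed-form approximation for the Rayleigh P value and the angular deviation defined as sqrt(2(1 − r)), with r the mean resultant length; the exact numerical procedure used in the study is not specified here, and the sample PDs are hypothetical.

```python
import math

def mean_resultant(angles_deg):
    """Mean direction (degrees) and mean resultant length r of a set of angles."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    return math.degrees(math.atan2(s, c)) % 360.0, math.hypot(c, s)

def rayleigh_p(angles_deg):
    """Approximate P value of the Rayleigh test (assumed approximation)."""
    n = len(angles_deg)
    _, r = mean_resultant(angles_deg)
    big_r = n * r
    return math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))

def angular_deviation_deg(angles_deg):
    """Angular deviation sqrt(2 (1 - r)), converted to degrees."""
    _, r = mean_resultant(angles_deg)
    return math.degrees(math.sqrt(2.0 * (1.0 - r)))

# Hypothetical PDs of one cell across epochs: clustered versus scattered.
clustered = [80.0, 95.0, 100.0, 110.0, 90.0, 85.0, 105.0]
scattered = [45.0 * k for k in range(8)]

unimodal = rayleigh_p(clustered) < 0.05          # PDs are clustered
dispersion = angular_deviation_deg(clustered)    # spread around the mean PD
```

For the scattered set the resultant length is essentially zero, so the test does not reject randomness and no angular deviation would be reported for such a cell.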
2. Description of the Models
General Structure of the Models
The external input current to parietal unit i is a sum of the components corresponding to the three unimodal, independent modules:
$$I_i^{\mathrm{ext}} = I_i^{e} + I_i^{v} + I_i^{h} \qquad \text{(A3)}$$
Each $I_i^q$, with q = e, v, h, is the sum of the activities of all units in the corresponding input module, weighted by the feed-forward synaptic matrix $W_{ij}^q$:
$$I_i^q = \sum_j W_{ij}^q\, r_j^q(\varphi_q) \qquad \text{(A4)}$$
$r_j^q(\varphi_q)$ is the activity of unit j in the afferent unimodal module of modality q in response to a stimulus at angle φq. It is modeled by a circularly wrapped Gaussian, as:
$$r_j^q(\varphi_q) = \exp\left[-\frac{(\varphi_q - \psi_j^q)^2}{2\sigma_q^2}\right] \qquad \text{(A5)}$$
where $\psi_j^q$ is the preferred direction of input unit j for the modality q, the angular difference is taken on the circle, and $\sigma_q$ is a model parameter that regulates the width of the tuning curve. This activity is maximal for $\varphi_q = \psi_j^q$. If the eccentricity is fixed and only the angle is varied, a Gaussian is quite similar to a two-dimensional sigmoid with a single orientation in the plane. This reconciles this choice with those of Zipser and Andersen (1988), Salinas and Abbott (1996) and Pouget and Sejnowski (1997).
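The weight matrix is modeled by a ‘Mexican hat’ M, peaked at the preferred direction:
$$W_{ij}^q = M(\psi_j^q - \theta_i^q), \qquad M(x) = A_{Ex}\exp\left(-\frac{x^2}{2\sigma_E^2}\right) - A_{In}\exp\left(-\frac{x^2}{2\sigma_I^2}\right) \qquad \text{(A6)}$$
$\theta_i^e$, $\theta_i^v$, $\theta_i^h$ are the three preferred directions corresponding to the three input modalities, assigned randomly and independently to each unit in the parietal model, and $\psi_j^q$ is the preferred direction of input unit j. AEx and AIn are, respectively, the amplitudes of the excitatory and inhibitory parts of the efficacy; σE and σI are the scales on which they fall off. Such a matrix may be considered as built-in, defining the unimodal preferred directions in the parietal network.
If the number of units in the input area is large and their preferred directions are roughly evenly distributed on the circle, the input to the parietal network reduces to a convolution of a Gaussian with the Mexican hat. If the variances are not too large, this current is approximately another Mexican hat:
$$I_i^q(\varphi_q) \simeq \tilde{M}(\varphi_q - \theta_i^q) \qquad \text{(A7)}$$
where $\tilde{M}$ has the form of M in equation (A6), up to a rescaling of the amplitudes, and $\tilde\sigma_E^2$, $\tilde\sigma_I^2$ are increased variances, each being the sum of the original variance and the variance of the input tuning curve. In what follows we will omit the tilde on the new variances.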
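The convolution argument can be checked numerically with a small Python sketch: a population of evenly spaced, Gaussian-tuned input units projects through Mexican-hat weights onto one parietal unit, and the resulting current is evaluated for stimuli at 16 directions. The difference-of-Gaussians form of the Mexican hat, the input width of 20° and the number of input units are assumptions made for illustration only.

```python
import math

def wrap(x):
    """Wrap an angular difference in degrees into [-180, 180)."""
    return (x + 180.0) % 360.0 - 180.0

def gaussian(x, sigma):
    """Gaussian of the wrapped angular difference x (degrees)."""
    return math.exp(-wrap(x) ** 2 / (2.0 * sigma ** 2))

def mexican_hat(x, a_ex=1.0, a_in=0.2, s_ex=30.0, s_in=180.0):
    """Narrow excitatory Gaussian minus broad inhibitory Gaussian (assumed form)."""
    return a_ex * gaussian(x, s_ex) - a_in * gaussian(x, s_in)

def feedforward_current(phi, theta_i, n_inputs=360, sigma_q=20.0):
    """Current into a parietal unit with preferred direction theta_i
    for a single-modality stimulus at angle phi (degrees)."""
    total = 0.0
    for j in range(n_inputs):
        psi_j = 360.0 * j / n_inputs                      # evenly spaced input PDs
        rate_j = gaussian(phi - psi_j, sigma_q)           # input unit activity
        total += mexican_hat(psi_j - theta_i) * rate_j    # weighted by the hat
    return total / n_inputs

theta = 90.0
stimuli = [22.5 * k for k in range(16)]
currents = [feedforward_current(phi, theta) for phi in stimuli]
best = stimuli[currents.index(max(currents))]   # stimulus giving the largest current
```

With these values the current is largest when the stimulus sits at the unit's preferred direction and falls off symmetrically on either side, as expected from a Mexican hat centered on that direction.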
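To account for cross-modality interactions, the synaptic matrix connecting input unit j of modality q to unit i of the parietal network (equation A6) is enriched by two new terms:
$$W_{ij}^q = M(\psi_j^q - \theta_i^q) + \alpha\left[M(\psi_j^q - \theta_i^{q'}) + M(\psi_j^q - \theta_i^{q''})\right] \qquad \text{(A8)}$$
where M(x) is defined in equation (A6), $\psi_j^q$ is the preferred direction of input unit j, $\theta_i^q$ is the preferred direction of parietal unit i for modality q, and q′, q″ are the two modalities other than q. The added terms have exactly the same form as the original one, but with a peak at the preferred direction of each of the other modalities. In (A8), α ∈ [0,1] expresses the amplitude of these terms relative to the original one. For α = 0, equation (A8) reduces to equation (A6), with no cross-modality interactions. In contrast, for α = 1 the inputs from the three different modalities become identical. In fact, summing over all the units j and repeating the convolution arguments that led to equation (A7), we get
$$I_i^q(\varphi_q) = M(\varphi_q - \theta_i^q) + \alpha\left[M(\varphi_q - \theta_i^{q'}) + M(\varphi_q - \theta_i^{q''})\right] \qquad \text{(A9)}$$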
The output rate νi will be assumed to be:
with no loss of generality since only the maximal properties of each unit will be under consideration in the applications of the model. Any monotone transfer function will be equivalent.
For a generic α, three new preferred directions, one for each modality, are defined as the peak of the response to each single modality.
The input current Ii to a parietal unit i is composed of a feed-forward component for each modality, as in equation (A3), and a recurrent part that is a weighted sum of the rates (νk) of the other parietal units:
$$I_i = \sum_{q=e,v,h} I_i^q + \sum_{k=1,\,k\neq i}^{N} J_{ik}\,\nu_k \qquad \text{(A11)}$$
where N is the number of units and $J_{ik}$ is the recurrent synaptic matrix of the parietal network.
The dynamics adopted to study the recurrent network is an evolution of the rates of the units in the network, or a mean-field theory. The evolution equation for the rate νi of unit i is:
$$\tau\,\frac{d\nu_i}{dt} = -\nu_i + \left[I_i - T\right]_+ \qquad \text{(A12)}$$
(as in Salinas and Abbott, 1996), where τ is an effective time constant, controlling the relaxation time, T is a threshold and $[x]_+ = \max(x, 0)$. The expression inside the square brackets is the total current afferent on unit i minus a threshold. The actual value of τ is immaterial, since we focus on stationary states of the network and not on transients.
Upon occurrence of mixed signals, the learning will shape the weights according to a Mexican hat function, for every possible pair of modalities between the two units connected:
$$J_{ik}^{qq'} = M(\theta_i^q - \theta_k^{q'}) \qquad \text{(A13)}$$
where $\theta_i^q$ is the preferred direction of unit i for modality q and the parameters are defined as for equation (A6).
The final synaptic matrix will be the combination of all these terms, with the contribution of those involving different modalities weighted by a parameter α ∈ [0,1]:
$$J_{ik} = \sum_{q} J_{ik}^{qq} + \alpha \sum_{q \neq q'} J_{ik}^{qq'}$$
For each parietal unit, for a generic α, three new preferred directions, one for each modality, are defined as the peak of the response to each single modality.
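A Python sketch of how such a recurrent matrix could be assembled is shown below. It assumes that same-modality pairs enter with weight 1 and different-modality pairs with weight α, that self-connections are absent, and that the Mexican hat is a difference of Gaussians; the random bare PDs and all names are hypothetical.

```python
import math
import random

def wrap(x):
    """Wrap an angular difference in degrees into [-180, 180)."""
    return (x + 180.0) % 360.0 - 180.0

def mexican_hat(x, a_ex=1.0, a_in=0.2, s_ex=30.0, s_in=180.0):
    """Narrow excitatory Gaussian minus broad inhibitory Gaussian (assumed form)."""
    d2 = wrap(x) ** 2
    return a_ex * math.exp(-d2 / (2 * s_ex ** 2)) - a_in * math.exp(-d2 / (2 * s_in ** 2))

MODALITIES = ("e", "v", "h")

def recurrent_matrix(units, alpha):
    """units[i] maps each modality to the bare PD (degrees) of unit i."""
    n = len(units)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i == k:
                continue  # no self-connection (assumption)
            same = sum(mexican_hat(units[i][q] - units[k][q]) for q in MODALITIES)
            cross = sum(mexican_hat(units[i][q] - units[k][p])
                        for q in MODALITIES for p in MODALITIES if p != q)
            J[i][k] = same + alpha * cross
    return J

rng = random.Random(1)
# Bare PDs assigned randomly and independently to each unit and modality.
units = [{q: rng.uniform(0.0, 360.0) for q in MODALITIES} for _ in range(20)]
J = recurrent_matrix(units, alpha=0.5)
```

Because the Mexican hat is even and every ordered pair of modalities is included, the resulting matrix is symmetric, and for α = 0 it reduces to the sum of the three same-modality terms.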
3. Modeling the Behavioral Tasks
For each task condition and each modality, the unit activities in the input modules were given values according to equation (A5), with the particular direction of the stimulus chosen for that modality. These values were then inserted in equation (A4) for the currents, using the fixed, pre-structured input matrix of equation (A8) for the feed-forward model, and equation (A6) for the recurrent one. In the latter, these currents were used as the three afferents in equation (A11) to drive the recurrent dynamics to the stationary state of the network. For each unit in the network, the model-generated response rates were computed, and Poisson noise was added to account in part for the rate variability of biological neurons.
In the model, the monkey’s behavior in an epoch of a given task was implemented as follows: for unit i the afferent current is the sum of three currents, Iv, Ie and Ih. The three epochs modeled have been implemented with currents corresponding to Table 1. For example, in the tht of the R task, the positions of the eye and of the hand coincide with the position of the visual target, which remains constant on the fovea. Therefore, in the afferent current (equation A3) φe = φh = 0°, …, 315°, while φv is constant; in the tht of the RF task, the position of the visual stimulus and the position of the hand vary with target location (φv = φh = 0°, …, 315°), while φe does not vary; in the tht of the S task, only the eye position varies (φe = 0°, …, 315°), while the other two signals remain constant. Note that the afferent current (Fig. 6, left and right panels) is represented by the sum of at most two terms, since no more than two angles (φq) are involved in any given epoch. When there are two angles, as in the first two epochs, they are equal. The synaptic matrices are prepared, in advance, in the final form they would have assumed at the end of a training process driven by Hebbian plasticity, as described by equation (A8) for the feed-forward model, and by equation (A13) for the recurrent one. In other words, we assume that stimuli from the different modalities have been presented to the animal, separately and in combination, leading to a synaptic structure that expresses the consequences of Hebbian plasticity in the corresponding model. To forestall the possibility that the performance of the models depends on the fine tuning of the synaptic matrices, we coarse-grained the distribution of the preferred directions, i.e. the preferred directions of each unit were discretized into 16 bins of 22.5°. This was done after we tested the models with synaptic matrices generated from continuous distributions of bare PDs.
Also, the analog depth of the synaptic efficacies was coarse-grained into five discrete values (depending on the interval in which the analog value fell) after the analog synaptic efficacies were generated (see previous sections).
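The two coarse-graining steps can be sketched in Python as follows. The sketch assumes that a preferred direction is snapped to the nearest multiple of 22.5° and that each efficacy is replaced by the midpoint of one of five equal-width intervals spanning the range of the analog values; the actual binning rule and the representative value of each interval are not specified in the text.

```python
import math

def discretize_pd(pd_deg, n_bins=16):
    """Snap a preferred direction to the nearest of n_bins equally spaced angles."""
    width = 360.0 / n_bins
    return (round(pd_deg / width) % n_bins) * width

def quantize(values, n_levels=5):
    """Replace each analog value by the midpoint of the interval it falls in."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    if width == 0.0:
        return list(values)  # all values equal: nothing to quantize
    quantized = []
    for v in values:
        level = min(int((v - lo) / width), n_levels - 1)  # top edge joins last level
        quantized.append(lo + (level + 0.5) * width)
    return quantized

# Hypothetical analog efficacies and bare PDs.
efficacies = [math.sin(0.1 * k) for k in range(100)]
coarse_efficacies = quantize(efficacies)
coarse_pds = [discretize_pd(pd) for pd in (30.0, 100.0, 355.0)]
```

Running the models on the coarse-grained quantities is a way to check that the results do not hinge on finely tuned synaptic values.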
The parameters for the Mexican hat function of equations (A7) and (A13) have been chosen as equal in the two models, and reflect the broad tuning relationships of each individual input modality, as commonly observed in the experimental data. They are: AEx = 1, AIn = 0.2, σEx = 30°, σIn = 180°, while for equation (A12) we have chosen T = 2. The remaining parameter α was chosen for each model independently to reproduce the fraction of neurons with GTF observed experimentally (Battaglia-Mayer et al., 2000, 2001).
This study was supported by funds from the Ministry of Scientific and Technological Research, and from the Ministry of Public Health of Italy, and by the Commission of the European Communities (DG XII contract number QLRT-1999-00448). Support is acknowledged (DJA) from a Center of Excellence Grant ‘Changing Your Mind’ of the Israel Science Foundation and a Center of Excellence Grant ‘Statistical Mechanics and Complexity’ of the INFM, Roma1.
| Epoch | Iv | Ie | Ih | Angles varied |
|---|---|---|---|---|
| tht R | 0 | 1 | 1 | φe = φh = 0°, …, 315° |
| tht RF | 1 | 0 | 1 | φv = φh = 0°, …, 315° |
| tht S | 0 | 1 | 0 | φe = 0°, …, 315° |
For the columns Iv, Ie, Ih, 1 means that the corresponding input stimulus is varied and 0 that it is fixed. The last column shows how the stimuli angles are varied across trials.
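The epoch definitions of Table 1 can be written down as a small Python structure, a sketch that assumes eight target directions spaced by 45° and marks a fixed signal with `None` rather than with a specific constant angle (the constant values are not given in the table). The names `EPOCHS`, `stimulus_angles` and `trials` are hypothetical.

```python
# Eight target directions: 0, 45, ..., 315 degrees.
TARGETS = [45.0 * k for k in range(8)]

# 1: the stimulus angle of that modality varies with the target; 0: it is fixed.
EPOCHS = {
    "tht R":  {"v": 0, "e": 1, "h": 1},
    "tht RF": {"v": 1, "e": 0, "h": 1},
    "tht S":  {"v": 0, "e": 1, "h": 0},
}

def stimulus_angles(epoch, target):
    """Angle per modality for one trial; None marks a signal that stays fixed."""
    return {q: (target if varies else None) for q, varies in EPOCHS[epoch].items()}

def trials(epoch):
    """Stimulus configuration for each of the eight targets in an epoch."""
    return [stimulus_angles(epoch, t) for t in TARGETS]

rf_trials = trials("tht RF")
```

In every epoch at most two modalities vary, and when two vary they share the same angle, so each trial is fully described by the target direction and the set of varying modalities.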