The Computational Anatomy of Visual Neglect

Abstract
Visual neglect is a debilitating neuropsychological phenomenon that has many clinical implications and, in cognitive neuroscience, offers an important lesion deficit model. In this article, we describe a computational model of visual neglect based upon active inference. Our objective is to establish a computational and neurophysiological process theory that can be used to disambiguate among the various causes of this important syndrome; namely, a computational neuropsychology of visual neglect. We introduce a Bayes optimal model based upon Markov decision processes that reproduces the visual searches induced by the line cancellation task (used to characterize visual neglect at the bedside). We then consider 3 distinct ways in which the model could be lesioned to reproduce neuropsychological (visual search) deficits. Crucially, these 3 levels of pathology map nicely onto the neuroanatomy of saccadic eye movements and the systems implicated in visual neglect.


Introduction
Visual neglect is a common syndrome in which patients neglect one side (typically the left) of space (Halligan and Marshall 1998). It is often caused by right middle cerebral artery strokes, but has also been reported as a consequence of inflammatory (Gilad et al. 2006), metabolic (Auclair et al. 2008), and degenerative (Ho et al. 2003; Andrade et al. 2010) diseases. It has also been observed as a feature of seizure activity (Heilman and Howell 1980; Turtzo et al. 2008; Schomer and Drislane 2015), and as part of a migraine aura (Di Stefano et al. 2013). In addition to the wide range of pathological processes which can cause the syndrome, visual neglect can be caused by a range of anatomical lesions. These include both cortical and subcortical (Karnath et al. 2002) insults. There is some evidence that the heterogeneity of the causes of visual neglect maps onto distinct behavioral phenotypes (Hillis et al. 2005; Grimsen et al. 2008; Medina et al. 2009; Verdon et al. 2009), and this has the potential to be exploited clinically and scientifically.
Eye tracking provides one way to characterize behavioral deficits in visual neglect. These measurements have demonstrated that patients with visual neglect perform saccades to the right side of space with a disproportionately high frequency, compared with leftward saccades. This occurs both spontaneously (Fruhmann Berger et al. 2008;Karnath and Rorden 2012) and during search tasks (Husain et al. 2001). While these biases will form the main subject of this article, it is important to note that it may be possible to elicit signs of neglect in patients with no deficit in ocular exploration. For example, in tasks requiring a manual response, it is possible that patients may exhibit a normal pattern of saccadic eye movements, but that they may be impaired in executing a response (Ladavas et al. 1997;Bourgeois et al. 2015). In this article, we consider the control of eye movements, and the conditions that would have to be fulfilled in order to explain the saccadic patterns observed in visual neglect. We aim to show that there is a well-defined and distinct set of conditions that can reproduce the neglect syndrome.
Active inference provides a principled framework in which to define these conditions, in terms of the prior beliefs that a patient would have to possess for their behavior to be Bayes optimal. The notion of optimal pathology might seem a strange one, but the existence of a set of prior beliefs that renders any behavior optimal is mandated by the complete class theorems (Wald 1947; Daunizeau et al. 2010). This means that we can characterize pathology in terms of optimal inference, but in a system or subject that operates under a poor model of its environment (Conant and Ashby 1970). In the following, we briefly review active inference and show how this normative approach can be used to identify the functional lesions that could cause visual neglect. We then propose a neuroanatomical network that is consistent with the neuronal message passing implied by active inference. This allows us to equate functional lesions to anatomical lesions, and to simulate saccadic eye movements for each lesion in silico. We explore the influence of subcortical structures (Karnath et al. 2002) in visual neglect, and the notion that visual neglect is a type of disconnection syndrome (Bartolomeo et al. 2007; He et al. 2007). The article concludes by asking whether the different sorts of (saccadic) behavior induced by distinct sorts of lesions are sufficient to identify the locus of the lesion. We address this question using in silico neuropsychology and Bayesian model selection.
The purpose of this paper is to describe the active inference scheme and establish its predictive validity in (simulated) visual neglect. In subsequent papers, we will validate the underlying functional anatomy using eye tracking and MEG in real (normal) subjects. Our ultimate objective is to translate this model into clinical studies, to provide a functionally and biologically grounded characterization of neuronal computations in patients with visual neglect.

Active Inference
The formal or normative framework used to characterize hemineglect calls on the notion of active inference. Active inference provides a Bayes optimal account of perception and action by appealing to some fundamental (variational) principles that apply to any system that has evolved to maintain an adaptive exchange with its environment. In brief, to sustain their integrity, adaptive systems must minimize the dispersion of their states. Mathematically, this dispersion corresponds to entropy, and is equivalent to surprise averaged over time (Friston 2009). Surprise, in the information theoretic sense used here, is the negative log probability of making a particular observation. A surprising observation is one that is unlikely under the prior beliefs possessed by an adaptive system, or subject. An intuitive example is that of blood pressure control. Animals are compelled to keep their blood pressure within narrow bounds in order to survive. The baroreceptor system implicitly "expects" blood pressure to be within this range with a high probability, so it is surprising when pressures outside this range are sensed. It is clear from this example that surprise is something to be avoided, if life is to pursue its familiar course. To interact with the environment, without incurring potentially fatal surprises, it is necessary to possess a generative model that describes how observations are generated. In most situations, the processes generating observable outcomes are sufficiently complex that it becomes intractable to compute surprise explicitly. Instead, a quantity called free energy can be computed (Dayan et al. 1995; Beal 2003; Friston 2003). This furnishes an upper bound on surprise, through Jensen's inequality (the average of a log is less than or equal to the log of an average, which is a consequence of the concavity of the logarithmic function).
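The displayed bound is not reproduced in this text. A standard statement of it, written with the symbols defined in the following paragraph and offered as a reconstruction under that assumption rather than as the original typeset equation, is:

```latex
-\ln P(\tilde{o})
  = -\ln E_{Q}\!\left[\frac{P(\tilde{o}, x)}{Q(x)}\right]
  \le -E_{Q}\!\left[\ln \frac{P(\tilde{o}, x)}{Q(x)}\right]
  = E_{Q}\!\left[\ln Q(x) - \ln P(\tilde{o}, x)\right]
  = F
```

The inequality is Jensen's inequality applied to the (convex) negative logarithm, and the final expression is the free energy F.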
Minimizing the free energy therefore minimizes surprise (when the bound is tight). In this equation, $x$ represents hidden (latent) variables, and $\tilde{o}$ represents the sequence of observations made over time. The particular form of the hidden variables depends upon the generative model. $P(\tilde{o}, x)$ describes the joint probability distribution of observations (i.e., consequences) and hidden variables (i.e., causes) under the generative model.
$Q(x)$ is an arbitrary distribution which becomes an approximate posterior as the free energy is minimized. Note that the first equality holds simply because $Q(x)$ on the right hand side can be canceled, resulting in a marginalization of the joint distribution over all hidden variables.
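The relationship between surprise and free energy can be checked numerically. The following is a minimal sketch, assuming a toy model with two hidden states and a single fixed observation; the probabilities and the function names are illustrative choices, not values from the model described in this article.

```python
import math

# Hypothetical joint probabilities P(o, x) for one fixed observation o
# and two hidden states x = 0, 1 (illustrative values only).
joint = [0.3, 0.1]

def surprise(joint):
    """Negative log evidence, -ln P(o), obtained by marginalizing over x."""
    return -math.log(sum(joint))

def free_energy(q, joint):
    """Variational free energy F = E_Q[ln Q(x) - ln P(o, x)]."""
    return sum(qx * (math.log(qx) - math.log(px))
               for qx, px in zip(q, joint) if qx > 0)

posterior = [p / sum(joint) for p in joint]   # exact posterior P(x | o)
arbitrary_q = [0.5, 0.5]                      # some other distribution over x

f_arbitrary = free_energy(arbitrary_q, joint)  # strictly above the surprise
f_posterior = free_energy(posterior, joint)    # equals the surprise
```

For any choice of Q the free energy is at least as large as the surprise, and the two coincide when Q is the exact posterior, which is the sense in which the bound becomes tight.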
The above demonstrates the (implicit or explicit) free energy minimization in adaptive systems. A system which does not minimize its free energy will fail to bound its entropy and, over time, will cease to exist; that is, will dissipate and decay. Active inference takes this further, by equipping an agent with beliefs about the policy (sequence of actions) it will pursue (Friston, Samothrakis, et al. 2012). A consequence of the imperative to minimize free energy is that, a priori, agents must believe they will minimize the free energy expected under allowable policies. Specifically, policies which are associated with a smaller expected free energy should be considered more likely than those associated with a larger expected free energy.

Markov Decision Processes
Given the discrete, serial nature of saccadic sampling, an appropriate model structure for the purposes of this article is a Markov decision process (MDP) (Mirza et al. 2016). These models are defined in terms of a discrete state space, with observations made at discrete time points. The generic structure of an MDP is shown in Figure 1. The hidden variables in this model are the hidden states, $s_\tau$, the parameters of the likelihood mapping, $A$, and the policy $\pi$. The free energy can be expressed in terms of these unknown or hidden quantities (the notation $E_Q[\cdot]$ means the expectation under the distribution $Q$).
An MDP is structured such that observable outcomes depend only on the hidden states. The probabilistic mapping from hidden states to outcomes is expressed in the matrix A, in which $A_{ij} = P(o_\tau = i \mid s_\tau = j)$. The hidden states depend only on the previous hidden state, and on the transition matrix B, which is a function of the policy. Preferences are specified in terms of the prior beliefs an agent has about the outcomes they will observe, and these are contained in the matrix C. D determines the probabilities of the initial states. The vectors E and G correspond to prior expectations about policies and expected free energy respectively (please see below). This generative model allows the factorization of $P(\tilde{o}, \tilde{s}, A, \pi)$ into conditionally independent factors. Using a "mean field approximation," we can additionally factorize $Q(\tilde{s}, A, \pi)$ into approximately independent factors. It is then possible to derive update equations for each factor of $Q$, by taking the derivative of the (variational) free energy with respect to that factor, and setting the result to zero (see Appendix for the derivation of the hidden state updates). In doing so, the update equations shown in Figure 2 are obtained. Reassuringly, when the variables in these equations are mapped out in terms of the influence each has over the others, the emergent structure closely resembles the architecture of cortical microcircuits, cortical hierarchies, and even corticosubcortical loops involved in policy evaluation. This loop is consistent with the structure and function of the basal ganglia (Jahanshahi et al. 2015). See Figure 2.
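The dependency structure just described (initial states from D, transitions from a policy-dependent B, outcomes from A) can be illustrated by rolling a policy forward. This is a minimal sketch under stated assumptions: the three-state model, its probabilities, and the helper names `matvec`, `move_to`, and `predict` are hypothetical and are not taken from the authors' implementation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a probability vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical toy model with 3 hidden states (locations) and 2 outcomes.
A = [[0.9, 0.5, 0.1],    # P(outcome 0 | state j), one column per state
     [0.1, 0.5, 0.9]]    # P(outcome 1 | state j)
D = [1.0, 0.0, 0.0]      # prior over initial states: start in state 0

def move_to(target, n_states=3):
    """Transition matrix B(u) for the action 'go to target' from any state."""
    return [[1.0 if i == target else 0.0 for _ in range(n_states)]
            for i in range(n_states)]

def predict(policy, A, D):
    """Roll a policy (sequence of actions) forward and return the predicted
    outcome distribution at each step."""
    state, outcomes = D, []
    for action in policy:
        state = matvec(move_to(action), state)   # s_{tau+1} = B(u) s_tau
        outcomes.append(matvec(A, state))        # o_tau = A s_tau
    return outcomes

predicted = predict([2, 1], A, D)   # visit state 2, then state 1
```

Each policy therefore induces its own sequence of predicted states and outcomes, which is what allows policies to be scored against one another.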

Memory and Short-Term Plasticity
Given that the probability distributions are specified as categorical distributions, the appropriate conjugate prior for the likelihood A matrix is a Dirichlet distribution. This means that the probability can be represented simply in terms of Dirichlet concentration parameters. For each state, $s_\tau = j$, there are a set of Dirichlet parameters, $a_{ij}$, one for each outcome, $o_\tau = i$, which could be associated with this state. These are initially "pseudo-observations," as no observation has yet been made. The belief about the probability of outcome i given state j is as follows: $P(o_\tau = i \mid s_\tau = j) = a_{ij} / \sum_k a_{kj}$. As observations are made, the agent is able to learn (that is, accumulate its Dirichlet parameters) to better fit its observations. This process of learning simply involves increasing the Dirichlet parameter representing a particular outcome when that outcome is observed (Beal 2003; Blei et al. 2003). The amount by which it is increased is the (approximate) posterior probability that each hidden state was occupied when the observation was made. This allows an agent to remember the observations they sampled when they believed they were in a particular state (Friston, FitzGerald, Rigoli, Schwartenbeck, O'Doherty, et al. 2016). The notion that the mapping between representations of 2 variables should be increased when the 2 are simultaneously active is strikingly similar to Hebbian plasticity (Hebb 1949; Brown et al. 2009). This analysis suggests that this form of memory could be implemented by short-term changes in synaptic efficacy.
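The accumulation rule described above can be sketched in a few lines. The example below assumes a small model with 3 outcomes and 2 locations; the function names and the starting counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def expected_likelihood(a):
    """Column-normalize Dirichlet counts a[i][j] to give P(outcome i | state j)."""
    n_outcomes, n_states = len(a), len(a[0])
    totals = [sum(a[i][j] for i in range(n_outcomes)) for j in range(n_states)]
    return [[a[i][j] / totals[j] for j in range(n_states)]
            for i in range(n_outcomes)]

def accumulate(a, outcome, state_posterior):
    """Increase the counts for the observed outcome by the posterior
    probability that each state was occupied when it was observed."""
    return [[a[i][j] + (state_posterior[j] if i == outcome else 0.0)
             for j in range(len(a[0]))] for i in range(len(a))]

# Hypothetical counts: rows are outcomes (none, target, canceled),
# columns are 2 fixation locations.
a = [[1.0, 1.0],
     [1.0, 1.0],
     [1.0, 1.0]]

# A target (outcome 1) is seen while the agent is certain it is at location 0.
a = accumulate(a, outcome=1, state_posterior=[1.0, 0.0])
A = expected_likelihood(a)
```

After the update, the belief that location 0 yields a target rises from 1/3 to 1/2, while beliefs about location 1 are unchanged. The coincidence of an active state representation and an active outcome representation is what drives the change, which is the Hebbian reading given in the text.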
An important consequence of the Dirichlet parameterization concerns the scaling of parameters. The scaling of the Dirichlet parameters does not influence the values in the likelihood matrix. However, it does influence the degree to which these change following an observation. If all the concentration parameters are very large (as would be the case if many past observations had been made), a single observation will make a very small difference to the likelihood, A. If the parameters are very small, an observation can trigger one-shot learning, suggesting a rapid short-term plasticity effect. Such effects have been proposed as one mechanism underlying working memory (Mongillo et al. 2008). This behavior is of particular interest in the current context, as will become apparent in the next section, where the form of the MDP used to model hemineglect is described.
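The effect of scaling can be made concrete with a short calculation. This sketch assumes uniform starting counts over 3 outcomes at a single location and asks how far the expected likelihood moves after one observation; the function name `belief_shift` and the two count values are illustrative assumptions.

```python
def belief_shift(prior_count, n_outcomes=3):
    """Change in P(outcome | state) for the observed outcome after a single
    observation, starting from uniform Dirichlet counts of size prior_count."""
    before = prior_count / (n_outcomes * prior_count)
    after = (prior_count + 1.0) / (n_outcomes * prior_count + 1.0)
    return after - before

small = belief_shift(0.01)    # weak prior: one observation dominates
large = belief_shift(100.0)   # strong prior: one observation barely registers
```

Both starting points encode the same expected likelihood (1/3 per outcome), yet the shift is roughly 0.65 for the small counts and roughly 0.002 for the large counts. This is the sense in which very small parameters permit one-shot learning.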

Saccadic Cancellation Task
The task performed by the particular MDP model used in this work is based on the pen-and-paper line cancellation task (Albert 1973;Fullerton et al. 1986). This task is used to assess visual neglect clinically, and is very sensitive (Ferber and Karnath 2001). Despite its popularity, it is worth noting that there are many possible reasons that performance of this task might be impaired. We will demonstrate this for a few of these reasons below. We will use a saccadic version, which involves presenting the subject with an array of targets that can be placed at various locations on an 8 × 8 grid. The task is to look at each of the targets until all targets have been sampled (i.e., canceled). When a target has been fixated, it changes color from black to red (see the right panel in Fig. 3), indicating that it has been seen. The model used to emulate this behavior is shown in Figure 3, in terms of the variables in the MDP. The only hidden states in this model correspond to the location currently foveated. An identity matrix maps these deterministically to proprioceptive observations, ensuring there is no uncertainty about the hidden state (i.e., where the subject is currently looking). The uncertainty in the model is contained in the (likelihood) mapping from hidden states to visual outcomes. There are 3 possible outcomes: no target (white), target (black), and canceled target (red). The prior preferences of the simulated agent are that it has equal preferences over all proprioceptive outcomes, prefers to see targets that have not been canceled, and does not expect to see targets that have already been canceled. The subject begins with (almost) uniform beliefs about the A matrix (i.e., what will happen if she looks at a particular location). However, these incorporate very weak, but accurate, beliefs concerning the locations of the targets. On foveating a target, the first visual outcome is a black target. 
This observation allows the appropriate Dirichlet parameters to be accumulated. During fixation, the target changes from black to red, and this causes further changes in the Dirichlet parameters, so that the subject remembers she has already canceled that location. This implements a synaptic form of spatial working memory (Mongillo et al. 2008). The subject may saccade to any location at any time, meaning there are 64 possible actions, each of which is associated with a corresponding transition (B) matrix. Having established the basic form of the generative model, sufficient to simulate visual search, we now turn to the finer details of the implicit epistemic foraging, how salient targets are selected, and what can go wrong under pathological priors.
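The ingredients of the task model can be laid out explicitly. The following is a minimal sketch of one way to construct them, assuming an 8 × 8 grid indexed 0 to 63; the numerical values (the size of the weak counts, the preference magnitudes, the target locations) and all names are hypothetical placeholders rather than the settings used in the simulations reported here.

```python
GRID = 8
N_LOCATIONS = GRID * GRID                    # 64 fixation locations (hidden states)
OUTCOMES = ("none", "target", "canceled")    # white, black, red

def identity(n):
    """Identity mapping from locations to proprioceptive outcomes."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def saccade_matrix(target, n=N_LOCATIONS):
    """B(u): a saccade to `target` lands there from any current location."""
    return [[1.0 if i == target else 0.0 for _ in range(n)] for i in range(n)]

def initial_counts(targets, weak=0.01, hint=0.001):
    """Nearly uniform Dirichlet counts for the visual likelihood, nudged
    very slightly toward 'target' at the true target locations."""
    counts = [[weak] * N_LOCATIONS for _ in OUTCOMES]
    for loc in targets:
        counts[OUTCOMES.index("target")][loc] += hint
    return counts

A2 = identity(N_LOCATIONS)                             # eye position is known exactly
B = [saccade_matrix(u) for u in range(N_LOCATIONS)]    # one matrix per action
C1 = {"none": 0.0, "target": 1.0, "canceled": -1.0}    # assumed visual preferences
C2 = [0.0] * N_LOCATIONS                               # flat proprioceptive preferences
a1 = initial_counts(targets=[3, 20, 45])               # assumed target locations
```

Keeping the starting counts small is what lets a single fixation rewrite the belief about a location, so that cancellation is remembered after one visit.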

Computational Neuropsychology
[Figure 3 caption, partially recovered] The start location is specified by D. The agent may saccade to any location on the grid (three possible saccades are shown), and the particular saccade is defined by u, which selects the appropriate B matrix. Each component of this matrix defines the probability of a saccade to a given location (j, k, or l in the figure), given a current location (i in the figure). There are 2 A matrices which provide a probabilistic mapping from the hidden states to the visual ($A_1$) or proprioceptive ($A_2$) outcome modalities. Prior preferences are defined by the C matrices, which are defined for each modality. On the right is a depiction of the structure of the task resulting from the generative model. The dotted line is the saccade path, and this demonstrates the change from black to red of targets as they are canceled.

In principle (under the complete class theorem), all neuropsychological syndromes can be formulated in terms of active inference. The challenge is to find the prior beliefs a subject would have to possess to render their behavior Bayes optimal. For visual neglect, we consider the abnormal patterns of saccadic eye movements in patients (Husain et al. 2001; Fruhmann Berger et al. 2008; Bays et al. 2010; Karnath and Rorden 2012), and the beliefs which would engender these patterns. For each saccadic policy, the generative model specifies the prior probability that the policy will be pursued. By analyzing the form of this prior belief, one can develop a differential diagnosis for the computational lesions in visual neglect. As noted above, the prior belief about policy should depend on the expected free energy. The smaller the expected free energy under a policy, the more likely it will be pursued. We can express this formally as follows:

$$P(\pi) = \sigma(-\gamma \cdot G(\pi))$$

In this equation, $\sigma(\cdot)$ is a softmax function, which ensures the resulting distribution will sum to one, making it a proper probability distribution.
The scale parameter $\gamma$ is an inverse temperature parameter, which acts as a precision over policy priors. $G(\pi)$ is the expected free energy associated with each policy $\pi$. This is defined as the sum of expected free energies for each future time point: $G(\pi) = \sum_\tau G(\pi, \tau)$.
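The role of the precision parameter can be illustrated with a small numerical example. This is a sketch only: the expected free energy values are made up, and `policy_prior` is a hypothetical helper that applies a softmax to the negative, precision-weighted expected free energy as described above.

```python
import math

def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def policy_prior(G, gamma=1.0):
    """Prior over policies: lower expected free energy, higher probability."""
    return softmax([-gamma * g for g in G])

G = [1.0, 2.0, 4.0]                     # hypothetical expected free energies
diffuse = policy_prior(G, gamma=0.5)    # low precision: flatter prior
precise = policy_prior(G, gamma=4.0)    # high precision: near-deterministic
```

With low precision the three policies remain competitive; with high precision almost all of the probability mass falls on the policy with the smallest expected free energy.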
The expected free energy has a similar form to that of the variational free energy. However, there are 2 key differences. The first is that it must be conditioned on the policy pursued, and the second is that, by definition, future observations have not yet been made. This means the expectation in Equation 2 must now include beliefs about future outcomes. Defining $\tilde{Q} = Q(o_\tau, s_\tau, A \mid \pi)$ allows us to express expected free energy as follows:

$$G(\pi, \tau) = E_{\tilde{Q}}\left[\ln Q(A, s_\tau \mid \pi) - \ln P(A, s_\tau, o_\tau \mid \pi)\right] \qquad (6)$$

Rearranging this, we can separate out the key terms that influence policy selection in the generative model described above.
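The rearranged expression is not reproduced in this text. A decomposition of the kind commonly used in the active inference literature, with $\tilde{Q}$ denoting beliefs about future outcomes, states, and the likelihood mapping under a policy, is sketched below. It is offered as an assumption consistent with the three terms discussed in the next paragraph (prior preferences first, salience second, novelty third), not as the original equation; the approximation sign reflects the replacement of exact posteriors with approximate ones.

```latex
G(\pi, \tau) \approx
  \underbrace{-E_{\tilde{Q}}\left[\ln P(o_\tau)\right]}_{\text{prior preferences}}
  \;-\;
  \underbrace{E_{\tilde{Q}}\left[\ln Q(s_\tau \mid o_\tau, \pi) - \ln Q(s_\tau \mid \pi)\right]}_{\text{salience}}
  \;-\;
  \underbrace{E_{\tilde{Q}}\left[\ln Q(A \mid o_\tau, s_\tau, \pi) - \ln Q(A \mid \pi)\right]}_{\text{novelty}}
```

Here the conditioning on $s_\tau$ in the novelty term is written out explicitly; when the hidden state is known with certainty, this makes no practical difference.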
The second (salience) term in this equation, in the context of the generative model used here, is identical for all policies. This is because the identity mapping from the hidden states representing locations to the proprioceptive outcomes allows the subject to infer location in visual space with no uncertainty. This means there is no information gain or epistemic value that would otherwise resolve uncertainty about the hidden state. The key terms that determine policy selection are the first and third. The former implies that a policy which is expected to fulfill the agent's prior beliefs (preferences) about outcomes has a lower expected free energy than one which does not. The latter suggests that a policy which affords the greatest change in the beliefs about the likelihood mapping, $P(A \mid o_\tau, \pi)$, from beliefs prior to seeing counterfactual outcomes, $Q(A \mid \pi)$, has the lowest expected free energy. Heuristically, policies that elicit observations that enable large Bayesian belief updates become more attractive. In other words, the subject will be attracted to novel contingencies that resolve uncertainty about the consequences of being in a particular state; that is, the likelihood mapping.
In short, prior preferences and novelty are both important factors in determining the selection of a location to saccade to. This implies 2 possible computational mechanisms for visual neglect. A subject may have a prior belief that she will experience the proprioceptive outcomes corresponding to the right side of space with a greater probability than those corresponding to the left. Alternatively, the subject may be more confident in her beliefs about the mapping from states to outcomes on the left, and therefore consider the right side of visual space novel. This is equivalent to starting with very large Dirichlet parameters (corresponding to a large number of pseudo-observations) for locations on the left. This follows because an observation resulting from a saccade to the left will induce a small change in beliefs about the likelihood mapping.
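The second mechanism can be illustrated by comparing how much a single observation changes beliefs at a location with large versus small Dirichlet counts. The sketch below uses a deliberately simple proxy for novelty (the divergence between the expected outcome distribution after and before one observation) rather than the expected divergence between Dirichlet distributions that a full treatment would use; the counts and names are hypothetical.

```python
import math

def normalize(counts):
    total = sum(counts)
    return [c / total for c in counts]

def belief_change(counts, observed):
    """KL divergence between the expected outcome distribution after one
    observation and the distribution before it (a simple novelty proxy)."""
    before = normalize(counts)
    updated = [c + (1.0 if i == observed else 0.0) for i, c in enumerate(counts)]
    after = normalize(updated)
    return sum(p * math.log(p / q) for p, q in zip(after, before) if p > 0)

# Hypothetical counts over (none, target, canceled) at one location per side.
left_lesioned = [100.0, 100.0, 100.0]   # many pseudo-observations: confident
right_intact = [0.01, 0.01, 0.01]       # few pseudo-observations: uncertain

novelty_left = belief_change(left_lesioned, observed=1)
novelty_right = belief_change(right_intact, observed=1)
```

Under these assumed counts, a saccade to the left changes beliefs by a negligible amount while a saccade to the right changes them substantially, so a novelty-seeking agent would favor the right.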
A third possibility relates to (baseline) prior beliefs about policies that may not depend upon expected free energy. Although active inference mandates that an agent believes it will pursue policies which minimize its expected free energy, it does not preclude fixed prior beliefs over policies which, in visual neglect, might identify saccades to the right to be a priori more likely than those to the left. To express this formally, we can augment the expression for priors over policies as follows:

$$P(\pi) = \sigma(E - \gamma \cdot G(\pi))$$

Here, E expresses the prior beliefs about policies that do not depend on the expected free energy. In this form, the (log) priors over policies are expressed as a linear function of expected free energy, where E corresponds to the y-intercept and precision is the sensitivity or slope.
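The effect of a fixed policy prior can be shown with two policies that are equally good in terms of expected free energy. This is a minimal sketch, assuming that E enters additively inside the softmax (consistent with its description as an intercept of the log prior); the values of E and G are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def policy_prior(G, E, gamma=1.0):
    """Log prior is linear in G: E sets the intercept, gamma the slope."""
    return softmax([e - gamma * g for e, g in zip(E, G)])

# Two hypothetical policies: index 0 is a leftward saccade, index 1 rightward.
G = [1.0, 1.0]                              # equal expected free energy
intact = policy_prior(G, E=[0.0, 0.0])      # no fixed bias
lesioned = policy_prior(G, E=[0.0, 2.0])    # fixed prior favoring rightward
```

Even though neither saccade is better in terms of expected free energy, the assumed bias in E makes the rightward policy much more probable, which is the habitual, outcome-independent bias described above.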
In summary, the above formal considerations have led us to identify 3 possible synthetic lesions which could give rise to visual neglect. These are changes in the priors over policy E, the Dirichlet parameters of the beliefs about $A_1$, and the priors concerning proprioceptive outcomes, contained in the matrix $C_2$. In the following section, we review plausible neurobiological substrates for each of these computational pathologies.

The Neuroanatomy of Hemineglect
The Dorsal Attention Network
The superior colliculus, in the midbrain, is a key site for the control of saccadic eye movements (Raybourn and Keller 1977). It is also a point of convergence for the cortical and subcortical structures involved in oculomotor control (Künzle and Akert 1977; Berson and McIlwain 1983; Fries 1984, 1985; Shook et al. 1990; Gaymard et al. 2003). The substantia nigra pars reticulata, a GABAergic output nucleus of the basal ganglia, projects directly to the colliculus (Hikosaka and Wurtz 1983), as do cortical areas including the frontal eye fields (Künzle and Akert 1977) and the lateral intraparietal cortex (Gaymard et al. 2003) (sometimes called the parietal eye fields (Shipp 2004)). These dorsal frontal and parietal areas constitute the dorsal attention network, and communicate via the first branch of the superior longitudinal fasciculus (Makris et al. 2004; Bartolomeo et al. 2012). The frontal eye fields are well placed to house the hidden states representing eye position, while dorsal parietal areas are suited to the representation of proprioceptive information. The former are known to contain spatial maps in egocentric space, as evidenced by demonstrations that stimulation of neurons in this region induces saccades that end in specific egocentric eye positions (Bruce et al. 1985; Sajad et al. 2015). The latter contain neurons that are modulated by multiple spatial reference frames (Andersen et al. 1985; Pouget and Sejnowski 2001).
The parietal cortex is part of the dorsal visual stream, thought to carry information about the location of a stimulus (Goodale and Milner 1992; Ungerleider and Haxby 1994). In the present context, the first branch of the superior longitudinal fasciculus would perform a coordinate transformation, bringing spatial information about a stimulus into egocentric coordinates, suitable for planning eye movements. This suggests that the superior longitudinal fasciculus corresponds to the connectivity or mapping encoding the likelihood matrix $A_2$ (Fig. 3). In our model, this is an identity mapping, but this is only the case when the head is assumed to be in a fixed position. A model which allowed for head movements would require this matrix to represent a more complex coordinate transform. Given that proprioceptive outcomes are represented in the dorsal parietal regions, inputs to this region must represent prior beliefs concerning proprioception. A candidate structure providing this information is the pulvinar, which is involved in visual search behaviors (Ungerleider and Christensen 1979). The connections from this region would then encode the $C_2$ matrix.

The Ventral Attention Network
Despite the important role of dorsal frontoparietal areas in the generation of saccadic movements (Corbetta et al. 1998), it is more ventral frontoparietal lesions which are associated with the visual neglect syndrome (Corbetta et al. 2000; Shulman 2002, 2011). These regions are the constituents of the ventral attention network, and are connected by the third branch of the superior longitudinal fasciculus (Rushworth et al. 2005; Bartolomeo et al. 2012). The parietal part of this network includes areas in the region of the temporoparietal junction, closer to the temporal regions associated with the ventral visual stream. This component of the visual system has been described as the "what" pathway (Ungerleider and Haxby 1994), propagating information concerning stimulus identity to complement the "where" information of the dorsal stream. The ventral temporoparietal regions are then good candidates for the representation of the visual outcome modality of the model, allowing them to influence eye movements in a stimulus-driven manner (Shomstein et al. 2010). Connections from the ventral frontal cortex could then carry information concerning prior beliefs (equivalent here to the instructions a subject would be given), consistent with the proposed role of areas in this region in representing task demands (Duncan 2001; Dosenbach et al. 2006) and in target detection (Stevens et al. 2005). This suggests that the third branch of the superior longitudinal fasciculus is the anatomical substrate of $C_1$.
Notably, the ventral attention network is lateralized to the right cerebral cortex, while the dorsal network is much more symmetrical (Thiebaut de Schotten et al. 2011; Vossel et al. 2012). This is consistent with the notion that temporal regions could represent the "what" modality, as identity is largely independent of location, and therefore does not require a bilateral representation. There is evidence to suggest that this unilateral representation of identity is right lateralized (James 1967, 1988; Warrington and Taylor 1973), while left sided homologues relate to object naming (Kirshner 2003). We note that, although temporoparietal regions are thought to play a role in target detection (Corbetta et al. 2000), they do not appear to be necessary for object recognition. The involvement of the ventral network is consistent with the fact that visual neglect is frequently associated with right hemispheric lesions.
This leaves the question of how lesions in ventral regions produce the saccadic deficits that might be expected from dysfunction of areas which are directly involved in saccadic control. One answer to this question is that visual neglect involves dysfunction of the dorsal network as a consequence of the failure of the ventral network, or of the interaction of the 2 networks (He et al. 2007). The 2 networks are joined by the second branch of the superior longitudinal fasciculus (Thiebaut de Schotten et al. 2011), and it has been proposed that visual neglect represents a functional disconnection syndrome involving this pathway. Given that this branch connects the parietal part of the ventral system to the frontal part of the dorsal system, this corresponds exactly to the mapping described by $A_1$. It is interesting that this tract, heavily implicated in visual neglect (Doricchi and Tomaiuolo 2003; Thiebaut de Schotten et al. 2005), appears to be the anatomical homologue of the mathematical entity identified above, on purely theoretical grounds, as a candidate for pathological priors.

Subcortical Structures
As stated above, an important input to the superior colliculus is the substantia nigra pars reticulata. This structure is a point of convergence for the direct and indirect pathways through the basal ganglia. Both of these originate from the striatum, which comprises the caudate nucleus and putamen. In visual neglect patients with subcortical lesions, there is substantial lesion overlap found in the putamen, and to a lesser degree in the caudate (Karnath et al. 2002). As indicated in Figure 2, the putamen is involved in the evaluation of policies. This fits with the proposed role of the basal ganglia. Additionally, as policies that are independent of the expected free energy are equivalent to habitual behavior, it makes intuitive sense that pathological biasing of policies would take place within a structure which is involved in habit formation; that is, the striatum (Yin and Knowlton 2006). The consistency of the anatomy of the basal ganglia with the policy update equations is further enhanced when the hierarchical extension of these equations is considered. These imply multiple parallel loops, originating and ending in the cortex, closely resembling those described in subcortical structures (Haber 2003).
The pulvinar is another subcortical region that is strongly implicated in visual search and neglect and, as mentioned above, connects to dorsal parietal areas (Weller et al. 2002; Behrens et al. 2003). This makes it a plausible anatomical substrate for the representation of prior beliefs about proprioceptive outcomes. This is consistent with accounts of the pulvinar in directing attention (Shipp 2003; Kanai et al. 2015) and eye movements (Petersen et al. 1985), and as a "salience map" (Robinson and Petersen 1992; Veale et al. 2017).
There are other possible lesions which could be accommodated by this model. For example, unilateral disruptions of the connections from the substantia nigra pars reticulata to the superior colliculus (Schiller et al. 1980, 1987; Hikosaka and Wurtz 1985a), or of the dopaminergic modulation of the striatum (Kori et al. 1995), have been shown to cause visual neglect-like syndromes. However, these lesions are rarely reported as causes of neglect in human patients. We have prioritized the lesions corresponding to the white matter tract that connects the dorsal and ventral attention networks, in addition to 2 common subcortical lesions: the putamen and pulvinar. These closely resemble the theoretically motivated lesions of $A_1$, E, and $C_2$.

Interhemispheric Interactions
In the above, our focus has been on disruption of the communication between posterior and frontal cortices, and on subcortical disconnections within the right hemisphere. Importantly, there is good evidence (Vuilleumier et al. 1996; Rushmore et al. 2006; Dietz et al. 2014) that neglect involves interhemispheric imbalances in addition to intrahemispheric disruptions (Bartolomeo et al. 2007; Bartolomeo 2014). This is a key feature of an existing model of neglect (Kinsbourne 1970). Fortunately for our framework, the 2 are inherently linked. Examination of the equations in Figure 2 (and the Appendix) reveals 2 key features in the belief updates for hidden states. The first feature is that beliefs about states are conditionally dependent upon policies. This means that any bias towards policies favoring saccades to the right will increase the probability, on taking a Bayesian model average over policies, of a fixation location on the right. Given the contralateral cortical control of eye movements, this corresponds to increased left hemispheric activity. The second important computational feature is the softmax function, which ensures posterior beliefs over allowable fixation locations sum to one (i.e., ensures a proper probability distribution). Such a constraint could be biologically implemented by inhibitory interactions within and between the 2 frontal eye fields. In other words, if fixations on the right side of space are considered more probable, it must be the case that leftward fixations are less probable. This necessarily implements a form of interhemispheric competition, one that is won by the left hemisphere if any of the lesions described in the previous section bias policies towards rightward saccades (Fig. 4).
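The link between policy biases and interhemispheric competition can be sketched with a Bayesian model average over policies. The example assumes a strip of 4 fixation locations (2 on each side of the midline) and one policy per location; the policy probabilities and the function name are hypothetical.

```python
def bayesian_model_average(policy_probs, states_under_policy):
    """Average state beliefs over policies, weighting each by its probability."""
    n_states = len(states_under_policy[0])
    return [sum(p * s[j] for p, s in zip(policy_probs, states_under_policy))
            for j in range(n_states)]

# Hypothetical 4-location strip: indices 0-1 lie left of midline, 2-3 right.
states_under_policy = [
    [1.0, 0.0, 0.0, 0.0],   # policy 0: saccade to far left
    [0.0, 1.0, 0.0, 0.0],   # policy 1: saccade to near left
    [0.0, 0.0, 1.0, 0.0],   # policy 2: saccade to near right
    [0.0, 0.0, 0.0, 1.0],   # policy 3: saccade to far right
]
biased_policies = [0.05, 0.15, 0.30, 0.50]   # assumed rightward-biased beliefs

beliefs = bayesian_model_average(biased_policies, states_under_policy)
left_mass = sum(beliefs[:2])    # leftward fixations (right hemisphere)
right_mass = sum(beliefs[2:])   # rightward fixations (left hemisphere)
```

Because the averaged beliefs must sum to one, any probability gained by rightward fixations is necessarily lost by leftward fixations. Under the contralateral mapping described above, this is the sense in which a rightward policy bias implies a competition won by the left hemisphere.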

Simulating Hemineglect
Heterogeneous Pathology to Homogenous Syndrome
Figure 5 shows the results of running the simulation for 20 saccades, under different prior beliefs (i.e., lesions). Strikingly, all 3 lesioned models produce very similar behavioral patterns. This heterogeneity of functional lesions is consistent with the diverse set of anatomical lesions known to cause visual neglect. While the nonlesioned model samples both sides of space, all 3 lesions cause a bias towards sampling the right side of space. This biased sampling is very similar to that observed in visual neglect patients (Bays et al. 2010). It is worth noting that people may have additional priors over their policies (contained in E) that result in a slightly different pattern of saccadic search than that depicted in Figure 5. For example, people might have a prior bias towards performing a saccade to a nearby target. We have omitted this additional prior, as our aim is to present a minimal model that reproduces the important features of neglect.
The functional disconnection induced by altering the Dirichlet parameters of A 1 effectively increases the novelty associated with saccades to the right hemifield. This corresponds to the functional disconnection of the dorsal and ventral attention networks, and can be thought of as impairing the "capture" of attention by salient stimuli, consistent with existing theories of visual neglect (Ptak and Schnider 2010) and attention (Shulman et al. 2009). The simulated pulvinar lesion causes the agent to fulfill their prior beliefs that they are more likely to be looking at the right side of space, and the lesioned putamen biases policy selection in favor of saccades in this direction.

Figure caption (fragment): [...] posterior parietal areas in the region of the lateral intraparietal area (LIP) and intraparietal sulcus (IPS). The frontal areas of this network are assumed to represent the hidden states, corresponding to the current fixation location. The parietal component represents proprioceptive outcomes (eye position). The connection between these frontoparietal areas is the first branch of the superior longitudinal fasciculus (SLF I), mediating the likelihood mapping between the hidden states and proprioceptive outcomes (A 2). The ventral attention network includes the ventral frontal cortex (VFC) and the temporoparietal junction (TPJ). These are connected by SLF III, which could carry prior preferences about visual outcomes (C 1). Visual outcomes are assumed to be represented in the TPJ, which suggests the SLF II is the mapping from hidden states to visual outcomes (A 1), and it is in these connections that the beliefs about the target locations are encoded. Prior preferences for proprioceptive outcomes are assigned to the pulvinar, a nucleus of the thalamus. On the right, the connections from the pulvinar to the dorsal parietal cortex (LIP) are shown. These are portrayed as conveying expectations about (proprioceptive) outcomes in C 2. The pathways through the basal ganglia are also shown. The policy evaluation processes shown in Figure 2 are depicted as stages in the direct pathway. In this scheme, the putamen evaluates the expected free energy, and baseline policy priors, E. These are modulated by dopaminergic inputs from the substantia nigra pars compacta, in proportion to their precision γ, and the output of the putamen is transformed by the substantia nigra pars reticulata into a distribution over policies. The simulated lesions we considered are numbered: 1, SLF II; 2, putamen; 3, pulvinar. As in the previous figure, red connections are excitatory, blue inhibitory, and green modulatory.
While mechanistically distinct, the behavioral profiles of each of these lesions do not appear to lend themselves to precise diagnoses in terms of observable behavior. In the next subsection, we consider a more realistic approach to spatial representations. We follow this with an attempt to determine whether the syndrome generated by these lesions is really as homogenous as it appears, or whether it is possible to identify the lesion from saccadic behavior.

Representing Visual Space
Spatial representations in the brain involve multiple segregated spatial scales and resolutions. The magnocellular system, for example, carries information with a relatively low spatial resolution, while the parvocellular system provides higher resolution information (Livingstone and Hubel 1988; Shipp 1988, 1989; Nealey and Maunsell 1994). Visual neglect provides further evidence for the brain's use of multiscale spatial representations. One example of this is the Ota search task (Ota et al. 2001), in which participants are asked to identify, from an array of shapes, which shapes are complete. While some visual neglect patients fail to address any of the left-hand side of the array ("egocentric visual neglect"), others address all shapes, but are impaired in determining which shapes are complete ("allocentric visual neglect"). Specifically, those shapes which have a deficiency on their right side are correctly identified as incomplete, while those deficient on their left side are incorrectly identified as complete. While many accounts have described the 2 perceptual deficits in terms of different spatial reference frames (Medina et al. 2009), it has been argued that both forms are actually different manifestations of an egocentric visual neglect (Driver and Pouget 2000; Corbetta and Shulman 2011). If both are considered to take place in the same reference frame, the 2 behavioral patterns would be consistent with visual neglect operating at a coarse spatial scale in the first case, and a finer scale in the second.
Equipping the model with a multiscale representation is simple to do in our generative model; instead of representing each of the 64 locations at a high resolution, we can encode each location using 3 levels (i.e., factors) of resolution, each level divided into 4 quadrants that, collectively, specify 4^3 = 64 locations. Technically, this means the A matrix now becomes 3 matrices encoding the likelihood mappings at low, intermediate, and high levels of resolution. Functionally, this means that the subject perceives visual input at 3 levels of resolution, and can entertain uncertainty (and novelty) at any level. This also means we have the opportunity to model pathological (prior) biases at the level of quadrants of the visual field, quadrants within each quadrant, and quadrants within those quadrants. Figure 6 shows a multiscale representation in our model, and its application to the saccadic cancellation task. The right panel shows how the Ota task was used to motivate this approach. If visual neglect is induced at a coarse scale (that is, the quadrant enclosed by the blue frame), an egocentric behavioral pattern of saccadic sampling would be expected. However, if induced at a finer scale (green frame), neglect would cause an allocentric pattern. Figure 7 shows the simulated eye tracking data generated under this multiscale representation. Lesions are shown at each spatial scale and are induced by scaling the corresponding Dirichlet concentration parameters. The other 2 types of functional lesion produce similar results. Crucially, different spatial scales of visual neglect could reflect different lesion topologies, as more ventral lesions have been associated more with neglect at the object scale (Grimsen et al. 2008; Medina et al. 2009; Verdon et al. 2009).
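The 3-factor encoding of the 64 locations can be sketched as follows. This is a minimal illustration rather than the code used for the simulations: the function names `encode`, `decode`, and `is_right_at_scale`, and the quadrant numbering (0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right), are assumptions introduced here.

```python
def encode(row, col, levels=3):
    """Map a cell of a (2**levels x 2**levels) grid to quadrant indices,
    ordered from the coarsest to the finest scale.

    Assumed convention: 0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right.
    """
    size = 2 ** levels
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError("location outside the grid")
    factors = []
    for level in range(levels - 1, -1, -1):
        row_bit = (row >> level) & 1   # 0 = upper half, 1 = lower half at this scale
        col_bit = (col >> level) & 1   # 0 = left half, 1 = right half at this scale
        factors.append(2 * row_bit + col_bit)
    return tuple(factors)

def decode(factors):
    """Inverse of encode: recover (row, col) from coarse-to-fine quadrant indices."""
    row = col = 0
    for quadrant in factors:
        row = (row << 1) | (quadrant >> 1)
        col = (col << 1) | (quadrant & 1)
    return row, col

def is_right_at_scale(factors, scale):
    """True if the location lies in the right half of its enclosing frame
    at the given scale (0 = coarsest)."""
    return bool(factors[scale] & 1)

# Every cell of the 8 x 8 array receives a distinct 3-factor code.
codes = {encode(r, c) for r in range(8) for c in range(8)}
example = encode(5, 2)   # row 5, column 2 of the array
```

Under this encoding, a rightward bias applied to the first factor favors the right half of the whole array, whereas the same bias applied to the last factor favors the right-hand location within each 2 × 2 subquadrant, which mirrors the distinction between coarse-scale and fine-scale neglect drawn in the text.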
A simplification in our generative model is the assumption that the head position is stationary. This allows us to treat the coordinate transform, performed by the first branch of the superior longitudinal fasciculus, as an identity transformation. If we did not make this assumption, the transformation would have to be modulated by a set of hidden states representing the head position, as in established models of parietal contributions to attention Sejnowski 1997, 2001). The influence of head position over the reference frame in which neglect is induced allows for the possibility of different egocentric coordinate systems. However, it may be that a set of egocentric reference frames is insufficient, on its own, to explain some neglect phenomena. There is evidence that the orientation of the axes of reference frames can be influenced by the spatial configuration of visual stimuli (Driver et al. 1994; Li et al. 2014), but the inferences involved in these processes lie outside the scope of this article. Importantly, deficits that are classically described as "object-centered" are rarely seen in the absence of "egocentric" deficits Yue et al. 2012). This suggests that such deficits are not an essential part of the neglect syndrome, but may occur in larger lesions that compromise additional connections.
In summary, we have seen that a normative (active inference) model of visual searches and biased (visual) sampling can provide a sufficient, if minimal, account of the functional deficits observed in patients during line cancellation tasks. The computational architecture and message passing implied by the active inference scheme are remarkably consistent with the known functional anatomy of visual search and saccadic eye movements, and with the deficits in epistemic foraging seen in patients with neglect. In the final section, we turn to the practical issues of using this sort of model to make inferences about lesions on the basis of saccadic eye movements.

Computational Lesion Deficit Analysis
We have established that, just as with anatomical lesions, there are several functional lesions that can induce very similar behavior. This raises an important question. Is the mapping from lesion to behavior truly a many-to-one mapping? In other words, is it possible, given the (simulated) behavioral data, to determine which lesion model generated it? If so, this could have important implications for clinical diagnosis, as it would allow the separation of distinct functional categories of visual neglect.
To answer this question, we used synthetic eye tracking data from each of the lesion models. To assess the ability of the paradigm to disambiguate among lesions, we computed the log likelihood of simulated behavior for every combination of lesion and model. This log likelihood, or log evidence, was computed by summing the log probability of each saccade under the posterior distribution over saccades predicted by each MDP model. Clearly, in a practical application, one would need to estimate (subject specific) parameters that best accounted for the observed behavior. However, in this instance, there are no unknown parameters and the log evidence for any given model reduces to the expected log likelihood, under that model. Given that we know each set of lesion data was generated by one of the models, we can calculate the posterior probabilities of each model using a softmax function of the log likelihoods for each synthetic dataset.
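The comparison described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the analysis code behind Figure 8: the function names, the two toy models, and the 2-target predictive distributions are hypothetical, and a flat prior over models is assumed so that the posterior is simply a softmax of the log evidence.

```python
import numpy as np

def log_evidence(saccades, saccade_probs):
    """Sum of the log probabilities a model assigned to the observed saccades.

    saccades:      chosen target index at each time step, length T
    saccade_probs: array of shape (T, n_targets) holding the model's
                   predictive distribution over targets at each step
    """
    saccade_probs = np.asarray(saccade_probs, dtype=float)
    steps = np.arange(len(saccades))
    return float(np.sum(np.log(saccade_probs[steps, saccades])))

def posterior_over_models(log_evidences):
    """Softmax over models (last axis), assuming a flat prior over models."""
    shifted = np.asarray(log_evidences, dtype=float)
    shifted = shifted - shifted.max(axis=-1, keepdims=True)
    unnormalized = np.exp(shifted)
    return unnormalized / unnormalized.sum(axis=-1, keepdims=True)

# Toy example with 2 targets (0 = left, 1 = right) and 5 observed saccades.
observed = np.array([1, 1, 1, 0, 1])
unbiased_model = np.tile([0.5, 0.5], (5, 1))    # hypothetical "healthy" predictions
rightward_model = np.tile([0.2, 0.8], (5, 1))   # hypothetical "lesioned" predictions

evidence_row = np.array([
    log_evidence(observed, unbiased_model),
    log_evidence(observed, rightward_model),
])
posterior_row = posterior_over_models(evidence_row)
```

Repeating this for each synthetic dataset gives one row of log evidence per dataset and one column per model; applying the softmax row by row yields a matrix of posterior probabilities over models.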
The results of this Bayesian model comparison are shown in Figure 8. It is clear from the confusion matrix shown in the figure that one can reliably disambiguate between health and pathology. Furthermore, the lesions in A 1 (i.e., a synthetic disconnection between dorsal and ventral attentional systems), although producing behavior that is visually very similar to that of the other 2 lesion models, generate a characteristic behavioral pattern, allowing the lesion identity to be recovered. We aim to use this disambiguation to provide an empirical test of our anatomical model. If we use transcranial magnetic stimulation to disrupt the communication between dorsal frontal and right temporoparietal regions, we expect that a lesion deficit analysis using eye movements will find greater evidence for an A 1 lesion than for any of the other lesion models. Lesions of E and C 2 could not be disambiguated from one another using only synthetic saccades. That the latter model has a greater posterior probability for both sets of simulated data suggests that this is a simpler explanation for the data, and has incurred a lower complexity penalty during Bayesian model comparison. However, both lesions were clearly identified as being abnormal, and not due to simulated lesions of the superior longitudinal fasciculus; that is, A 1. This suggests that distinguishing between the 2 may require an additional data modality, such as reaction time, pupillometry, or electrophysiology.

Figure 6. Multiscale representations of space. In the illustration on the left, 2 fixation points in a sequence of saccades are highlighted. This is to demonstrate their representation in terms of a multiscale spatial state space. In the center left, this state space is shown for each fixation point. This specifies a location in an 8 × 8 space, as before. However, the location is specified in terms of which quadrant (blue), which subquadrant (red), and which subsubquadrant (green) the location is found in. These 3 specifications constitute the hidden states of the multiscale model. An advantage of this model is that it allows visual outcomes to be defined at different resolutions. This is shown in the center right. Each outcome corresponds to the density of targets in the quadrant, subquadrant, and subsubquadrant currently fixated. Darker shades indicate a greater density. Note that the finest resolution is at the level of individual locations, so density is equivalent to the presence or absence of a target. Canceled targets appear red at this level only; lower resolutions are considered to be color-blind, consistent with the properties of the magnocellular system (Hubel and Livingstone 1987). As a saccade is made from a quadrant containing 3 targets to one containing 4, the lowest resolution (blue frame) outcome becomes denser. Similarly, the subquadrant representation (red frame) becomes darker, as a subquadrant containing only one target is followed by a subquadrant containing 2. The finest resolution (green frame) represents the maximum density (one target) for both fixation locations. The illustration on the right motivates the multiscale representation in terms of the Ota task. This shows one quadrant of an array of shapes. If the blue frame were biased towards occupying the right side of the array, this would resemble an egocentric hemineglect. If the green frame were biased towards the right, this would be closer to an allocentric hemineglect.

Figure 7. Lesions at different spatial scales. By changing the number of the initial Dirichlet parameters, we have simulated hemineglect at 3 resolutions. As can be seen in the above, the coarse scale representation biases saccades to the right side of the array, similar to the patterns seen in Figure 5. The medium scale representation biases saccades to the right side within each of the 4 quadrants of visual space. Hemineglect at the finest scale biases saccades to the right of each subquadrant (comprising 4 possible locations). For larger targets, but the same spatial scales, each of these biased sampling policies would produce results very similar to those observed in patients performing the Ota task.

Theoretical Neurobiology
The work presented in this paper closely relates to a number of recent advances in theoretical neurobiology. We have built upon previous formulations of visual exploration under active inference (Friston, Adams, et al. 2012; Mirza et al. 2016), but there are a number of important distinctions between these accounts and the current work. The first is that we have used active inference to address the impact of (computational) lesions, and to demonstrate how neuropsychological disorders can be described functionally in terms of pathological priors.
The second difference is subtle: the formulations mentioned above involved the selection of saccadic targets to minimize uncertainty. This is shared with the current work, and with earlier theories of visual salience (Itti and Baldi 2006), but the quantities about which the simulated agent is uncertain differ. Previously, we have emphasized uncertainty about hidden states in the environment. Here, we focus on the uncertainty about the relationship between hidden states and their sensory consequences. It is this important difference that facilitates the analysis of disconnection syndromes, as these can be formulated as disruption of sensorimotor contingencies.
Previous models have addressed attentional processes in general (Bundesen 1998; Heinke and Humphreys 2005), and neglect specifically (Kinsbourne 1970; Heinke and Humphreys 2003). Our approach complements many of these models, while making use of more recent theoretical developments. The belief update scheme we have employed has been used to reproduce a range of other behaviors Moutoussis et al. 2014; Friston et al. 2015), physiological responses, and pathologies (Schwartenbeck, FitzGerald, Mathys, Dolan, Wurst, et al. 2015), emphasizing its plausibility as a description of brain function. Additionally, our use of active inference allows us to appeal to a physiologically plausible process theory that facilitates the formation of empirical hypotheses about electrophysiological data. For example, we would expect that there would be an increase in the effective connectivity (in healthy subjects) between the regions connected by the second branch of the superior longitudinal fasciculus as Dirichlet parameters are accumulated. We anticipate that this should be reflected in the activities of neurons in these brain regions, and that dynamic causal modeling for evoked responses (David et al. 2006) provides a means to test this hypothesis experimentally. The simulation of eye movements adds to this, as we can use these behavioral data to complement imaging data, as in previous experimental work in this area.

Conclusion
Visual neglect can be formulated as a computational bias in an active inference scheme that can be quantified in terms of abnormal prior beliefs. In the above, we identified 3 theoretically motivated functional lesions. On defining a generative MDP model that performed a cancellation task, we found that the connectivity implied by the model structure corresponded well to the anatomy of the dorsal and ventral attention networks, in addition to their subcortical influences. The functional lesions in this anatomical assignment matched lesions associated with visual neglect; namely, in the second branch of the superior longitudinal fasciculus, the putamen, and the pulvinar. The saccadic behavior generated under these lesion models closely resembles that of patients with visual neglect. To provide a more realistic spatial representation, we used a multiscale encoding of visual state space. This allowed us to demonstrate visual neglect at different scales. Encouragingly, although the saccadic behaviors appeared homogenous across each lesion model, we found that we could recover distinct groups of lesions by comparing the evidence for each lesion in synthetic data. In principle, this demonstrates that computational phenotyping of visual neglect patients is possible.

Figure caption (fragment): [...] for each model, m (columns), given synthetic eye tracking data, õ, generated from each model (rows). This is equivalent to the (log) likelihood or model evidence, as there were no unknown parameters. These results were generated using multiscale representations with lesions at the coarsest resolution in all cases. On the right is the matrix of posterior probabilities P(m | õ). This is obtained from the matrix on the left, using a softmax function applied to the log evidence in each row (i.e., for different models of each synthetic dataset).

Notes
Conflict of Interest: The authors have no disclosures or conflict of interest.