The fundamental question as to whether the neural correlates of any given conscious visual experience are expressed locally within a given cortical area or more globally within some widely distributed network remains unresolved. We inquire as to whether recursive processing—by which we mean the combined flow and integrated outcome of afferent and recurrent activity across a series of cortical areas—is essential for the emergence of conscious visual experience. If so, we further inquire as to whether such recursive processing is essential only for loops between extrastriate cortical areas explicitly representing experiences such as color or motion back to V1 or whether it is processing between still higher levels and the areas computing such explicit representations that is exclusively or additionally essential for visual experience. If recursive processing is not essential for the emergence of conscious visual experience, then it should also be possible to determine whether it is only the intracortical sensory processing within areas computing explicit sensory representations that is required for perceptual experience or whether it is the subsequent processing of the output of such areas within more anterior cortical regions that engenders perception. The present analysis suggests that the questions posed here may ultimately become experimentally resolvable. Whatever the outcome, the results will likely open new approaches to identify the neural correlates of conscious visual perception.
The raw sensations of achromatic brightness, color and motion together with other subjective or ‘phenomenal’ sensory experiences are often collectively referred to as ‘qualia’. Several groups (Stoerig and Cowey, 1995; Pollen, 1999; Lamme, 2001; Stoerig, 2001) have argued on the basis of extensive neurologic, psychophysical, electrophysiological and behavioral evidence that the first pass of afferent activity through the striate cortex (V1) may at times—if not always—be insufficient to engender a conscious visual experience. Rather it appears that such experiences cannot occur at least until V1—or in the case of the somatosensory system (Kulics, 1982; Cauller, 1995) S1—has been reactivated by recurrent activity from higher cortical areas. Such back-projecting pathways link virtually every ‘higher’ occipitotemporal and occipitoparietal visual cortical area with its immediately antecedent ‘lower’ projection field (Pandya and Yeterian, 1985; Felleman and Van Essen, 1991) (Fig. 1). The reciprocal connections form a recursive neural network (RNN). Models based on RNNs have been proposed to account for a wide range of observations [for a review, see Pollen (Pollen, 1999)], but the present attempt to develop a systematic framework to delineate the function of recurrent loops as a single entity versus that of specific local loops in phenomenal visual experience appears to be novel.
Evidence that activation of an RNN is necessary for the emergence of phenomenal visual experience has been growing. Cowey and Walsh (Cowey and Walsh, 2000) applied transcranial magnetic stimulation (TMS) to the motion area (V5) of the much-studied hemianopic subject G.Y., who lacks a functionally intact V1 in one hemisphere. Such stimulation elicited sensations of moving lights when applied to V5 in his normal hemisphere but not when applied on his hemianopic side. The analogous study with respect to colored phosphenes has not yet been reported, presumably because the ‘color areas’ in the human brain lie ventrally and are not readily accessible to TMS. Even so, these results on motion strongly suggest the necessity of an intact V1 (even apart from its obligatory role in projecting the outcome of its computations on geniculocalcarine afferent activity to extrastriate cortices) for at least some visual experiences.
The case of G.Y. cited above is, of course, only a single report, and it cannot be excluded that the reported negative findings may be subject-specific or stimulation/parameter-specific rather than general. Indeed, certain types of motion can be experienced in striate-blinded patients on the basis of subcortical pathways that project directly to MT/V5 (Zeki and ffytche, 1998). [See also Pollen (Pollen, 1999) and Stoerig (Stoerig, 2001) for extended discussion of reports of other phenomenal experience referable to hemianopic fields.]
TMS produces its effects as a consequence of excitation and/or disruption of the cortical area(s) stimulated. Workers in this field acknowledge the technical limitations of TMS; namely, that they cannot yet ascertain precisely the depth of brain stimulation nor its spatial resolution. Nor can they as yet determine which neural elements are most sensitive to stimulation nor even whether the effects of stimulation are attributable to activity at the site of the stimulus or at more distant sites to which activity has spread (Pascual-Leone et al., 2000). Even so, these workers believe that the technique is reliable at the level of establishing relationships between behavior and excitation or disruption of cortical activity following TMS and particularly so with time-dependent perceptual alterations as will be considered below.
Thus, in a subsequent study to that of Cowey and Walsh (Cowey and Walsh, 2000), Pascual-Leone and Walsh (Pascual-Leone and Walsh, 2001) showed that TMS over V5 in normally sighted subjects induced similar experiences of moving lights that were extinguished when V1 was inactivated by stimulation from a second coil. This extinction occurred only at delays selected to block recurrent activation of V1. These results too are consistent with work (Corthout et al., 2000) suggesting that inactivation of V1 by TMS after the initial afferent volleys have traversed V1 is also able to block certain visual discriminations and experience.
However, these TMS studies cannot distinguish whether such results depend upon the integrity of the entire recursive loop back to V1, upon more local feedback from V5 to V1 and/or to V2, or upon a reactivated V1 that is essential for perception because it provides concurrent activation of the parietal lobe and other areas, in addition to the activation of afferent pathways by the direct stimulation of V5. [In referring to any given cortical area, we do not exclude any essential interactions between that area and its subcortical interdependent projection zones (Llinas et al., 1998).]
Moreover, each successive cortical level adds further contingencies on neural representations for object identity and localization with respect to eye, head or limb position, motivational state, remembrances of previous encounters with such visual targets and the range of possibly intended movements. Thus, as yet it has not been possible to determine whether visual experience depends upon some minimally defined long chain of recursive loops in some distributed sense, by which is meant that the percept would cease to exist following an interruption of the chain of loops at any cortical level, or alternatively only at specific cortical levels. The specific levels of most immediate interest are those that express an explicit neural representation (Crick and Koch, 1995, 1998) or receive its output.
Crick and Koch (Crick and Koch, 1995) suggested the term ‘explicit representation’ to denote some stimulus-selective minimal number of neurons, together with their projective fields, that coarsely encode some aspect of the visual scene. The targets of afferent and recurrent activity from such projective fields have not yet been fully specified and are a subject of the present inquiry. Crick and Koch (Crick and Koch, 1995) further suggest that a person is not necessarily aware of all such explicit representations and they (Crick and Koch, 1998) employ the more inclusive term ‘the neural correlate of consciousness’ or NCC to specify that spatiotemporal pattern of neuronal activity that corresponds to a conscious experience.
The cores of such explicit representations, which complete the computation required to specify specific sensory attributes, presumably correspond to those most central cortical areas which when damaged eliminate selectively that particular visual experience. For example, destruction of a cortical area beyond V1 and V2 selectively eliminates human color perception (Damasio et al., 1980; Zeki, 1990). Such modularity of visual function (Zeki, 1997, 1998) applies to motion (V5) as well as color. The precise boundaries for and identity of the color area(s) in man are controversial (Hadjikani et al., 1998; Zeki et al., 1998) and we will simply refer to the color area(s) in humans as V4/V8. In macaques the posterior inferotemporal cortices appear essential for color vision (Heywood et al., 1998). Cortical area V5/MT is essential for at least one type of motion perception (Newsome and Paré, 1988). Similarly, there is evidence that at least low level achromatic static visual cues for orientation and spatial frequency are explicitly represented in V1 (Pollen, 1999; Morland et al., 1999; Paradiso, 2002). Explicit representations for curvature may not be extensively elaborated until V4 (Gallant et al., 1993; Wilson et al., 1997; Pasupathy and Connor, 1999; Pollen et al., 2002).
However, it remains uncertain as to whether perceptual experiences are engendered within the core of such explicit neuronal representations, within the linked network of the core and its projective fields, within even more spatiotemporally extensive recursive networks, or—hypothetically, at least—only within, or together with, more anterior cortical areas, i.e. executive spaces, that receive direct projections from such explicit representations (Crick and Koch, 1995).
Other work suggests that perceptual experience requires interactions between neural representations of the sensory ‘image’ and representations for the sense of self (Damasio, 1999). This latter model does not exclude the possibility that the emergence of phenomenal experience may depend upon multilevel recursive processing for sensory representations as well as recursive interactions between such representations and representations for the sense of self.
Can Visual Experience Persist in the Absence of Recurrent Activity to V1?
We now inquire as to whether visual experience would persist if afferent information flow through V1 to all higher cortical areas remains intact but all recurrent activity to a sector of V1 is selectively blocked (Fig. 1). The experimental paradigm proposed here differs from that of Cowey and Walsh (Cowey and Walsh, 2000), who evoked the experience of motion by stimulation of V5 only in the normal hemisphere of a hemianopic subject, because the experience of motion (as for any percept) may require both ‘what’ and ‘where’ information, which may not be achievable when V5 is activated in the absence of an intact V1, a condition that precludes even first pass afferent activity from reaching cortical areas processing ‘where’ information.
The region of the visual field corresponding to the sector to be blocked can be predetermined by means of standard microelectrode recording or fMRI techniques. Visual stimuli specified for static achromatic brightness, motion and color, the latter under equiluminant conditions, can then be limited to the ‘recursively blocked’ visual field with the non-blockaded fields serving as controls.
The persistence of conscious visual experience for luminance referable to the recursively blocked visual field would refute models that suggest that recursive processing back to V1 is essential at the very least for the perception of static achromatic visual stimuli (Stoerig and Cowey, 1995; Pollen, 1999; Lamme, 2001; Stoerig, 2001). However, the persistence of conscious perception for motion and/or color would not distinguish between purely hierarchical ‘afferent only’ and recursive models at the level of V1 because both classes of models would accept that explicit representations for motion and color are elaborated beyond V1.
Moreover, if blockade of recurrent activity to V1 abolishes conscious visual perception for one or another type of percept, the interpretation of such a result is rather complex and we cannot determine without additional testing whether activity within V1 is an essential condition for conscious visual perception. For example, suppose the elimination of all recurrent activity simply reduced the contrast gain of V1 neurons so much that afferent activity could no longer drive V1 neurons above their excitatory threshold. In this case, adequate electrical stimulation of V1 as will be defined later—or even of a V1 with all excitatory synaptic intracortical activity silenced—could then be tested. The evocation of a conscious visual experience under these conditions would then strongly support an ‘afferent only’ model.
However, a failure of adequate electrical stimulation of a recursively blocked V1 to generate a conscious visual experience would provide strong evidence against an ‘afferent only’ model and strongly support recursive models. If this latter result were found, then one conclusion might be that it is not simply the strength of activity within V1 that is essential for conscious visual perception but rather the temporal relationship of such activity within V1 to that within still higher cortical areas as all relevant areas are ‘gated’ or temporally related by the recurrent projections.
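The interpretive logic of the preceding three paragraphs can be summarized as a small decision table. The following Python sketch is purely illustrative: the function name, the enumerations and the outcome labels are hypothetical shorthand introduced here, not part of any experimental protocol, and the sketch assumes that each test yields a clean yes/no behavioral report.

```python
from enum import Enum
from typing import Optional


class Percept(Enum):
    LUMINANCE = "static achromatic luminance"
    MOTION = "motion"
    COLOR = "color"


class Outcome(Enum):
    REFUTES_RECURSION_TO_V1 = "refutes models requiring recursion to V1 for luminance"
    INCONCLUSIVE_AT_V1 = "does not distinguish the model classes at the level of V1"
    NEEDS_ELECTRICAL_TEST = "ambiguous until electrical stimulation of V1 is tested"
    SUPPORTS_AFFERENT_ONLY = "supports an 'afferent only' model"
    SUPPORTS_RECURSIVE = "supports recursive models"


def interpret_v1_blockade(
    percept: Percept,
    persists_with_visual_stimulus: bool,
    persists_with_electrical_stimulus: Optional[bool] = None,
) -> Outcome:
    """Map a (hypothetical) result of recurrent blockade to V1 onto its interpretation."""
    if persists_with_visual_stimulus:
        # Only luminance is assumed by recursive models to depend on loops back to V1.
        if percept is Percept.LUMINANCE:
            return Outcome.REFUTES_RECURSION_TO_V1
        # Both model classes place motion and color representations beyond V1.
        return Outcome.INCONCLUSIVE_AT_V1
    # The percept was abolished: a loss of contrast gain could explain this,
    # so adequate electrical stimulation of the blocked V1 is the follow-up test.
    if persists_with_electrical_stimulus is None:
        return Outcome.NEEDS_ELECTRICAL_TEST
    if persists_with_electrical_stimulus:
        return Outcome.SUPPORTS_AFFERENT_ONLY
    return Outcome.SUPPORTS_RECURSIVE
```

For example, `interpret_v1_blockade(Percept.LUMINANCE, False, False)` corresponds to the case in which neither visual nor adequate electrical stimulation evokes an experience, the outcome taken above to support recursive models.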
The same approach used for V1 can be applied seriatim to determine whether higher cortical areas (Fig. 1), such as V5/MT and V4/V8, which we assume to express explicit representations for motion and color, respectively, nevertheless require recurrent processing from still higher cortical levels or alternatively participate in conscious visual perception only within the context of an ‘afferent only’ model.
Required Methods to Test the Effects of Selective Blockade of Recurrent Projections
Selective blockade of recurrent activity as proposed above would be easy if the afferent and recurrent excitatory projections utilized entirely different sets of receptors. We could then present specific visual stimuli and determine whether an animal experiences color, motion or achromatic brightness when the recurrent projections to the relevant cortical areas, V4/V8, V5 and V1/V2 respectively, are blocked. However, there is no evidence for such differential receptor selectivity.
On the contrary, afferent excitation from the retina is projected onto ionotropic glutamate receptors of lateral geniculate nucleus (LGN) neurons, whereas the recurrent corticogeniculate projections terminate on both ionotropic and metabotropic receptors of LGN projection neurons (McCormick and Von Krosigk, 1992; Guillery and Sherman, 2002). This particular recurrent pathway multiplicatively enhances the contrast gain of LGN neurons for stimuli tested over the classical receptive field (Przybyszewski et al., 2000). These results on contrast gain control are consistent with the results of Hupé et al. (Hupé et al., 1998) that cortical feedback enhances discrimination between figure and background by V1, V2 and V3 neurons in part by facilitating responses to stimuli within the classical receptive field. Whether or not this pattern holds for other corticocortical recurrent projections affecting the control of contrast gain, it is doubtful that one can locally block recurrent ionotropic pathways without concurrently blocking ionotropic afferent pathways. (However, a reviewer suggests that there is a distinct possibility that drugs that target different compositions of NMDA and/or non-NMDA synaptic receptor complexes may one day enable us to block selectively specific glutamatergic feed-forward or feed-back pathways.)
It would even now be possible to induce local blockade of only metabotropic receptors and determine if the relevant visual experiences to appropriate visual stimuli persist when metabotropic recurrent activation is suppressed. If such blockade eliminates phenomenal experience, the result would provide further opportunities to search for the sources of such recurrent activity conveying selective metabotropic activation and to probe the significance of the temporal constraints, i.e. delayed activation of metabotropic pathways (Guillery and Sherman, 2002).
However, if perception persists under such blockade, then, as noted above, one cannot employ the same paradigm to also block ionotropic receptors because afferent activity from visual stimulation would then be silenced as well. Instead, one could make multiple local injections of ‘cocktails’ of both ionotropic and metabotropic glutamate blockers in such target areas as V1, V5 or V4/V8. Such blockade, if complete and if all afferent excitation is mediated by glutamatergic activation, would eliminate the visual experiences of luminance, motion and color, respectively, for visual stimuli presented to ‘blocked’ regions of the receptive field, but the neurons and their axons within such regions would still remain excitable by electrical stimulation.
Fortunately, electrical stimulation over both striate and extrastriate occipital areas gives rise to circumscribed well localized visual experiences called ‘phosphenes’ that may be experienced as white, colored or moving depending upon the cortical site stimulated (Penfield and Rasmussen, 1952). [See Grüsser and Landis (Grüsser and Landis, 1991) for review of earlier and later work with respect to both similarities and differences in descriptions of phosphenes evoked by extrastriate versus striate cortical stimulation.] Such visual experiences may be evoked by either surface (Brindley and Lewin, 1968; Dobelle and Miladejovsky, 1974) or intracortical (Bak et al., 1990) stimulation and over a very wide range of stimulus parameters (Pollen, 1975) once some threshold has been reached.
Moreover, such phosphenes can be elicited from cortical stimulation after the visual cortex has been cut off from geniculocortical activation (Krause and Schum, 1931; Brindley and Lewin, 1968; Dobelle and Miladejovsky, 1974). Hence, neither baseline afferent input to visual cortical neurons from the LGN nor recurrent activity back to this nucleus is required for the production of cortically generated phosphenes.
Thus, it should be possible to electrically stimulate each locally inactivated area either cortically or intracortically (Fig. 1) and determine whether any visual experience of phosphenes persists, whether for luminance, motion or color, when the corresponding electrically evoked activity projects forward from these areas while their recurrent activation is blocked. In ‘control’ studies fMRI would have to be employed to determine the pattern of regional activation associated with the generation of electrically evoked phosphenes as distinguished from subthreshold activation. (The stimulating electrodes and connecting leads would need to be fabricated using fMRI compatible materials.)
Of course, even if glutamate were the only neurotransmitter involved in feed forward cortico-cortical activation, there is no evidence that all recurrent pathways utilize glutamate as neurotransmitter. On the contrary, various recurrent pathways subserving various aspects of selective attention employ acetylcholine, noradrenaline or dopamine as neurotransmitter (Fossella et al., 2002; Fan et al., 2002).
The same technique would then have to be employed to assure that multiple local injections of cocktails of glutamate blockers (Miller et al., 1989)—or alternatively appropriate anti-sense expression constructs (Davidkova et al., 1998; Liu et al., 2000) against either glutamate receptors or the synthesis of glutamate—plus cholinergic, noradrenergic, and/or dopaminergic blockers as applicable for the area under study have suppressed all afferent and recurrent activity as well as behavioral response to selective visual stimuli to the respective test area.
In studies to inactivate excitatory synaptic activity within a sector of a given cortical area, the GABA agonist muscimol might be used. This agent has inactivated local cortical areas (Reiter and Stryker, 1988) in anesthetized animals, but effective methods of neuronal block would need to be adapted for alert animals.
Crick and Koch (Crick and Koch, 1998) have previously considered the question as to whether feedback pathways are essential for normal visual consciousness and suggested the importance of so testing by selectively inactivating recurrent pathways both singly and collectively. They suggest that new methods in molecular biology may, in time, make such testing possible. But perhaps, until then, the existing methods described above coupled with either visual or electrical stimulation will suffice.
Moreover, such animals would have to be trained according to the paradigms of Cowey and Stoerig (Cowey and Stoerig, 1995) and Stoerig and Barth (Stoerig and Barth, 2001) to distinguish whether any stimulus trains that evoke behavioral responses are representative of phenomenal visual experiences rather than as examples of non-phenomenal blind-sighted (Weiskrantz, 1995a,b) responses. Clearly, the validity of assuring these behavioral distinctions is the sine qua non of the proposed studies.
The procedures cited above to distinguish responses based on implicit visual information processing, as occurs in blindsight, from phenomenal experiences are quite lengthy and cannot easily be completed during brief acute interventions. On the other hand, if chronic interventions are required, the experimenter faces the problems of neural plasticity and adaptation. For example, the more selective the intervention, the more likely it is that the brain will, over time, compensate for its effects.
The same reviewer who recommended inclusion of the above caveats also suggested that if the selective blockades discussed above were rapid and fully reversible, then the proposed studies could, in principle, be carried out in human patients undergoing neurosurgical procedures. This could be done under either local anesthesia or after any preliminary general anesthesia had worn off. In such cases, immediate reports about visual experience would be feasible. If there are neurosurgical opportunities during which time such studies can be carried out quickly, safely and, above all, ethically and without discomfort, then the resolution of the technical issues considered here would be vastly accelerated.
Finally, if an animal reported phenomenal experience from stimulation of V1 after inactivation of recurrent projections to this cortex, then ‘matching to sample’ tests would be needed to determine whether the electrically induced phosphenes were interpreted as white or colored. The latter is an important qualification because, at least in theory, the experience of low level luminance cues might well depend upon recursive activity back to V1 whereas the experience of poorly spatially bounded swatches of color might not.
Further Interpretation of Outcomes
The combination of methods described should permit the determination as to whether conscious visual perception depends upon recursive neural networks back to V1; upon recursive activity back to V4/V8 and V5/MT, exclusively or additionally, in the case of color and motion, respectively; or alternatively, whether such perception requires no more than ‘afferent only’ processing. These various outcomes can be further analyzed.
For example, suppose that the results suggest the necessity of recursive processing for the emergence of visual experience. The next task of determining which particular loop(s) is or are essential for phenomenal vision is difficult because there are multiple recurrent projections to each visual cortical area (Fig. 1). Indeed, at least seven distinct cortical areas project back even to V1 (Pandya and Yeterian, 1985; Felleman and Van Essen, 1991).
Moreover, such recurrent pathways are neither anatomically nor functionally homogeneous. Some pathways are direct, like those from V2 to V1 (Rockland and Virga, 1989), or from temporal areas directly to certain occipital cortical areas (Rockland and Van Hoesen, 1994); see also Salin and Bullier (Salin and Bullier, 1995). Others are indirect and comprise a series of short connections within the temporal lobe (Rockland and Drash, 1996). Still others, like those emanating from the amygdala, display a ‘canopy’ arrangement with fibers projecting back to multiple visual areas (Amaral and Price, 1984).
Functionally, some systems must be required to express selective attention back to V1 as well as extrastriate areas (Somers et al., 1999), others to modulate activity even as early as V1 as a function of gaze direction and distance (Rosenbluth and Allman, 2002), others to iteratively achieve figure-ground segregation and object recognition (Grossberg, 1994; Mumford, 1994), others to express the effects of emotional state upon sensory processing and still others to actively search the memory and generate conceptual constructs that may initiate action. If the sources of these recurrent pathways can be identified and selectively blocked without interfering with other loops, then there may be an opportunity to identify some minimal set of loops essential for phenomenal experience and to determine how they relate to other brain regions and representations.
On the other hand, if only afferent processing is required beyond the level of extrastriate explicit representations, then one can further inquire as to whether the phenomenal visual experience is a property of neuronal interaction within the area expressing the explicit representation or arises only from the processing of that area’s axonal output within more anterior cortical areas. The distinguishing test would be to inactivate all excitatory activity within the area in question and adequately stimulate its axonal output to higher cortical areas. Failure to elicit a conscious visual experience would suggest that there was something unique for such experience expressed by activity within the stimulated cortical area, whereas a positive result, i.e. the report of the appropriate phenomenal experience, would suggest that it is only the interaction of such outputs within more anterior areas that generates a percept. However, in either case, the critical neurons or intracortical networks involved might be discovered using new methods for inactivating selective cells and local networks (Zemelman and Miesenböck, 2001; Lechner et al., 2002; Slimko et al., 2002).
Selective Attention, Recursive Neural Networks and Representations of the Sense of Self
The objective of this section is to examine the possibility that neural representations of the sense of self both receive the output of and selectively attend to explicit sensory representations utilizing recursive neural networks as a necessary condition for the emergence of conscious experience. Some related current issues on attention are first briefly reviewed.
Attention, which is conventionally taken to indicate the selection of one set of sensory inputs over others, can be driven by salient bottom-up (or stimulus-driven) or top-down (or goal-directed) processes that almost invariably interact (Egeth and Yantis, 1997). Attentional selection involves a multilevel elevation of activity for attended stimuli and a screening out of unwanted stimuli and can be selective for either spatial location or features of objects when their localization is not known in advance (Desimone and Duncan, 1995). Lamme (Lamme, 2003) suggests that attentional selection depends, at least in part, upon the convolution of the processing of current sensory inputs and both long- and short-term memory.
Treisman and Gelade (Treisman and Gelade, 1980) proposed that attention to spatial location is necessary to properly bind the diverse features of objects such as luminance, color and motion that may be explicitly represented in different cortical areas. Treisman and Kanwisher (Treisman and Kanwisher, 1998) accept that unattended objects may be implicitly registered, but maintain that attention is required to bind features, to represent three-dimensional structure, and to mediate awareness.
Even so, the relationship between attention and conscious visual perception remains controversial. Many workers agree that we are selectively aware of and can report on the content of that part of the visual field to which we selectively attend. Studies of change blindness (Rensink, 2000) support this view insofar as focused attention is necessary to detect large and sudden changes in visual displays under conditions when such changes cannot be picked up by low level motion detectors.
This view is challenged by Lamme (Lamme, 2003) who suggests that a conscious experience of sudden change actually occurs but is immediately forgotten. However, since such changes fail on their own to trigger a voluntary report, it is not easy to confirm that a conscious experience transiently occurred.
Others claim that little or no top-down selective attention is needed for conscious experience outside the focus of attention (Braun and Sagi, 1990; Lee et al., 1999). However, it seems difficult to exclude entirely the possibility that bottom-up stimuli capture at least some minimal attentional resources. For us, the critical issue is not simply some meager improvement in the quality of the peripherally attended sensory representation as a consequence of such minimal attention but rather the establishment of a neural representation of the more central agency that expressed the attention and its spatial relationship to the unified perceptual process discussed below.
Thus, it remains possible that selective attention, as expressed by at least one type of recurrent pathway, is an essential requirement for visual experience at least within the focus of attention. In one sense, then, the proposed experiments on blocking recursive pathways may be equivalent to testing whether one or another of the sources of top-down selective attention (Fan et al., 2002; Fossella et al., 2002) to cortical areas expressing explicit representations is a necessary condition for the emergence of phenomenal experience.
Crick and Koch (Crick and Koch, 2003) suggest, in general terms, that ‘the front of the brain is “looking at” the sensory systems, most of which are at the back of the brain’. Whether such ‘looking at’ implies selective top-down attention or somewhat automatic bottom-up processing should become resolvable within the context of the same issues considered here.
Stoerig and Cowey (Stoerig and Cowey, 1995) emphasized the need of all nervous systems to distinguish self from non-self. Damasio (Damasio, 1999) has extended this concept to suggest that percepts do not arise until the neural representation of a sensory image, whether exteroceptive or interoceptive, has been interpreted by or interacted with the neural representation for the sense of self. (For example, in the dramatic case of excruciating dental pain, Damasio doubts that a subject can experience a toothache unless his neural representation for the sense of self in some way ‘knows’ that it is his tooth that is aching.)
In my view, the most compelling justification for Damasio’s proposal is its provision of a neural representation for a sense of ownership of a sensory experience analogous to the representations of the sense of authorship for a volitional act (Wegner, 2002). Furthermore, Damasio provides some experimental support for his hypothesis. For example, Damasio (Damasio, 1999) describes a patient who lapsed into akinetic mutism—a condition characterized by inability to move or speak even when electrographically awake—as a consequence of bilateral injury to the anterior cingulate gyri. The patient seemed unable to experience any sustained conscious sensation while in this state nor to recall any sensory experiences related to this state after recovery. Damasio also cites patients in the advanced stages of Alzheimer’s disease who appear to undergo a concurrent dissolution of sensory experience and the sense of self. Whereas alternate explanations of these clinical findings may be possible, Damasio’s hypothesis nevertheless provides a promising rationale for probing relationships between explicit sensory neuronal representations and those subserving the sense of self, and for determining whether and how such interactions may apply to the emergence of conscious visual perception.
The parietal lobe, which provides spatial representations in various coordinate systems (Andersen et al., 1985; Colby and Goldberg, 1999) and binds features (Friedman-Hill et al., 1995), might well comprise one set of outposts of a multifaceted representation of the sense of self that selectively attends to the sensory world [see also Driver and Mattingley (Driver and Mattingley, 1998)]. Regions expressing such covert selective attention are widely distributed over the cerebral cortex, with major epicenters within, but not limited to, the posterior parietal cortex, frontal eye fields and cingulate gyrus (Mesulam, 1999). It may not be a coincidence that there is some regional overlap between Mesulam’s large-scale distributed network for covert spatial attention (Gitelman et al., 1999) and some of Damasio’s anatomic candidates for representing the sense of self.
Whether the neural representations of the sources of selective attention are intermediaries between sensory representations and those for the sense of self or are an integral part of the latter is clearly a topic beyond the present discussion. [If it should turn out that conscious experience outside the focus of attention can occur in the absence of attentional resources, it would remain possible that networks iteratively engaging figure-ground segregation and object recognition (Grossberg, 1994; Mumford, 1994) could still gain privileged access to and from neural representations for the sense of self.]
The failure of lesions of so many areas beyond the V2/3 border in humans to produce visual field defects (Horton and Hoyt, 1991) may well be a consequence of the diversity of recurrent connections back to early visual areas that can be addressed by so many regional sources (Mesulam, 1999) of selective attention, even when one or another such pathway has been damaged (Fig. 2). Moreover, and in the same context, an essential difference between the purposive action attending phenomenal vision and the automata of ‘zombie vision’ (Koch and Crick, 2001) might relate to the expressibility of covert selective attention in the former and its absence in the latter.
Selective attention might serve as one bridge between the seer and the seen. If both the neural representations of the seer and the representations of the seen should prove to be essential for any phenomenal visual experience, then the necessary correlates for such experience might reside within some minimally defined recursive loop, in contradistinction to the possibility that the loop endows some selective anatomic level with the emergence of phenomenal experience. Whatever the answer, the novel approaches suggested here may yet resolve this issue and open another line of attack to define the necessary conditions for and the neural correlates of phenomenal visual perception.
Is There a Privileged Instant for Perceptual Experience?
Should it turn out that recursive neural networks are a necessary condition for the emergence of visual experience, whether or not selective attention is always an essential condition, we may push the question of necessary conditions one step further. As one example, the visual experiences of multistable figures and binocularly rivalrous stimuli alternate from one stable state to another without our experiencing any intermediary stages in the computations [for a review, see Pollen (Pollen, 1999)]. These results suggest that at the instant that a stable percept emerges, the activity in feed-forward and recurrent pathways has achieved some sort of steady state (Pollen, 1999) with a minimum of residual activity (Mumford, 1992; Ullman, 1995) at each level between the two projection systems. If some sort of consensus or complementarity (Ullman, 1995) between the two loops at each successive cortical level is also essential for other types of visual experience, then we may ponder whether such complementarity imposes some particular temporal gating function across a series of cortical levels.
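The notion of a loop that settles into a steady state with minimal residual activity can be illustrated with a deliberately simple numerical sketch. It is not a model of cortex and not a method taken from the works cited above; it assumes a linear two-level loop in which a hypothetical weight matrix `W` carries the higher level's estimate back down as a prediction, and the unexplained remainder is passed forward as a correction. The names `W`, `settle`, `rate` and `steps` are illustrative assumptions.

```python
import numpy as np

# Hypothetical feedback weights: map a 2-unit "higher" estimate onto a
# 4-unit "lower" representation. Values are arbitrary, chosen for illustration.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])

def settle(x, W, steps=60, rate=0.1):
    """Run a two-level loop; return the final estimate and the residual norms."""
    r = np.zeros(W.shape[1])               # higher-level estimate, starts at zero
    history = []
    for _ in range(steps):
        prediction = W @ r                 # recurrent (top-down) signal
        residual = x - prediction          # lower-level activity left unexplained
        history.append(float(np.linalg.norm(residual)))
        r = r + rate * (W.T @ residual)    # afferent (bottom-up) correction
    return r, history

x = W @ np.array([0.5, -1.0])              # an input the loop can fully account for
r, history = settle(x, W)
print(history[0], history[-1])             # residual norm before and after settling
```

In this toy case the residual shrinks on every pass until the two signals agree. The intermediate iterations stand in, very loosely, for the unexperienced stages of the computation, and only the settled state would correspond to the stable outcome discussed above.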
The Biological Relevance of Perceptual Experience
The emergence of qualia may have profound biological significance rather than being simply an epiphenomenon. To survive, each organism must be able to distinguish immediate from past sensory events. To the extent that qualia may, at the very least, signify that a computation based upon the most recently attended sensory data has been completed, such phenomenal experience may be a genuine marker in its own right for the content of such immediate experience, which is then distinguishable from non-phenomenal ideation or recollections, even of the recent past, based upon working memory. If so, then the search for the distinction between the neural correlates of qualia and those for non-phenomenal ideation may promote both such identifications.
I am very grateful to Drs John Lisman, Deepak Pandya, Michael Paradiso, David Paydarfar, Andrzej Przybyszewski and Terry Sanger for their constructive criticism and helpful suggestions on presubmission drafts. I am also grateful to Drs Patrick Cavanagh and Ron Rensink for helping me respond to some questions raised by the reviewers. Finally, I am extremely grateful to the reviewers for their extremely helpful comments.
Address correspondence to Daniel A. Pollen, Department of Neurology, University of Massachusetts Medical School, Worcester, MA 01655, USA. Email: email@example.com.