A key question in sensory perception is the role of experience in shaping the functional architecture of the sensory neural systems. Here we studied dependence on visual experience in shaping the most fundamental division of labor in vision, namely between the ventral “what” and the dorsal “where and how” processing streams. We scanned 11 fully congenitally blind (CB) and 9 sighted individuals performing location versus form identification tasks following brief training on a sensory substitution device used for artificial vision. We show that the dorsal/ventral visual pathway division of labor can be revealed in the adult CB when perceiving sounds that convey the relevant visual information. This suggests that the most important large-scale organization of the visual system into the 2 streams can develop even without any visual experience and can be attributed at least partially to innately determined constraints and later to cross-modal plasticity. These results support the view that the brain is organized into task-specific but sensory modality-independent operators.
Ever since the seminal work by Ungerleider and Mishkin (1982), it has been repeatedly shown that visual processing is carried out in 2 parallel pathways. The ventral occipitotemporal “what” pathway, or the “ventral stream,” has been linked with visual processing of form, object identity, and color. Its counterpart is considered to be the dorsal occipitoparietal “where/how” pathway, or the “dorsal stream,” which analyzes visuospatial information about object location and participates in visuomotor planning and visually guided movement (Goodale and Milner 1992; Goodale 2008).
This double dissociation between the processing of the 2 streams has been thoroughly validated by studies of localized lesions separately affecting visual object identity recognition (visual agnosia) and object visuomotor spatial manipulation (optic ataxia; for a review, see Goodale 2008) and by numerous human imaging studies (e.g., Haxby et al. 1991; Shmuelof and Zohary 2005). The anatomical basis for this division has also been studied and shows a complex pattern of bottom-up connectivity, beginning with the primary visual cortex and creating 2 parallel (though not completely independent) anatomical connectivity streams. One stream leads dorsally, through the posterior parietal cortex toward the premotor cortex, thus creating a natural “path” toward preparation for motion and spatial processing. The other leads through area V4, which is selective for the color and size of visual objects (Zeki 1983; Desimone and Schein 1987), to inferotemporal areas containing complex visual object representations (Desimone 1991; Tanaka 1997), and up to the prefrontal cortex (O'Scalaidhe et al. 1997). This hardwired bottom-up connectivity pattern creates a strong constraint toward the generation of these streams in the presence of normal visual input during development. However, the creation of 2 separate visual streams and their functional selectivities may not be so trivial in the absence of visual input, which deprives the streams of their natural input.
Thus, while years of specialized research have established visual cortex segregation into functional streams as one of the most fundamental characteristics of the visual system, what remains unclear is the role of visual experience in shaping this functional architecture of the brain. Does this fundamental large-scale organizational principle depend on visual experience? Or are innately determined constraints sufficient for the emergence of the division of labor between the 2 streams, even in the absence of vision?
One way to study these questions is to investigate whether traces of the visual functional division of labor can be identified in fully CB individuals who have never had any visual experience, by using sensory substitution devices (SSDs). SSDs are noninvasive devices that provide visual information to the blind via their existing senses (mainly audition and touch, see Fig. 1A; Meijer 1992; Bach-y-Rita and Kercel 2003) which may thus provide a valuable test for the existence of critical (or sensitive) periods for developing this fundamental functional segregation between the 2 visual streams. If the CB show no such differentiation, then visual experience is necessary and stream formation may be restricted to the presence of visual input. Conversely, if there are indications of a functional “visual” segregation in the absence of visual experience, the streams may emerge at least partially from innate biases, connectivity patterns or functional differentiation of processing shape and location functions which are not necessarily intrinsically visual by nature, but reflect a task- or computational-selectivity that is independent of the sensory input modality (e.g., Reich et al. 2011) and may thus be flexible at later ages regardless of visual experience (see also Pascual-Leone and Hamilton 2001; Renier et al. 2005a, 2005b, 2010; Poirier, Collignon et al. 2006; Amedi et al. 2007; Beauchamp et al. 2007; Mahon et al. 2009, 2010; Ptito et al. 2009; Ricciardi et al. 2009; Matteau et al. 2010; Sani et al. 2010).
In addition to the important theoretical implications of such questions, answering them could also contribute to clinical rehabilitation of individuals suffering from peripheral blindness, who may regain some level of visual peripheral sensory input through novel cutting-edge medical approaches. Although these approaches can restore some visual information in the periphery, this does not guarantee that such information can be easily and adequately understood by the individual (e.g., Fine et al. 2003; Gregory 2003; Ostrovsky et al. 2006, 2009). Because the blind brain undergoes vast cross-modal plastic changes due to prolonged sensory deprivation (even to the extent of processing language and memory in the occipital cortex, especially in the ventral stream: Röder et al. 2001; Amedi et al. 2003; Bedny et al. 2011), it may have lost, or never developed, its ability to properly process vision and the functional neural architecture supporting it. Thus, it may be relevant also for clinical sight restoration to question whether the functional architecture of the visual cortex critically and exclusively depends on visual experience and whether congenital and longitudinal visual deprivation causes the visual streams to entirely lose their natural visual roles, which may prevent them from reverting to natural vision processing if peripheral input could be restored (and vice versa if residual stream specialization can be found).
Materials and Methods
Visual-To-Auditory Sensory Substitution
We used a visual-to-auditory SSD called “The vOICe” (Meijer 1992), which enables “seeing with sound” for highly trained users (seeing with sounds can also be achieved using other algorithms, e.g., PSVA; see Renier and De Volder 2010). In a clinical or everyday setting, users wear a video camera connected to a computer and stereo headphones; the images are converted into “soundscapes” using a predictable algorithm, allowing them to listen to and then interpret the visual information coming from a digital video camera. Remarkably, proficient users are able to differentiate the shapes of different objects, identify the actual objects, and also locate them in space (Amedi et al. 2007; Auvray et al. 2007; Proulx et al. 2008). The functional basis of this visuoauditory transformation lies in spectrographic sound synthesis from any input image, which is then further perceptually enhanced through stereo panning and other techniques. Time and stereo panning constitute the horizontal axis in the sound representation of an image, tone frequency makes up the vertical axis, and loudness corresponds to pixel brightness.
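The image-to-sound mapping described above can be illustrated with a minimal sketch. This is not the vOICe implementation: the function name, the 1-s scan duration, the sample rate, the 500 to 5000 Hz frequency range, the exponential frequency spacing, and the linear panning law are all assumptions chosen for illustration. Only the three mapping rules (column position to time and stereo pan, row position to tone frequency, brightness to loudness) come from the text.

```python
import math

def image_to_soundscape(image, duration_s=1.0, sample_rate=8000,
                        f_low=500.0, f_high=5000.0):
    """Render a grayscale image as stereo samples (illustrative sketch only).

    image: list of rows (top row first), each a list of brightness values in [0, 1].
    Returns a list of (left, right) sample pairs.
    """
    n_rows = len(image)
    n_cols = len(image[0])
    # Vertical axis -> tone frequency: top row is highest (assumed exponential spacing).
    freqs = [f_high * (f_low / f_high) ** (r / max(n_rows - 1, 1))
             for r in range(n_rows)]
    samples_per_col = int(duration_s * sample_rate / n_cols)
    out = []
    # Horizontal axis -> time: columns are played one after another, left to right.
    for c in range(n_cols):
        pan = c / max(n_cols - 1, 1)  # 0 = far left, 1 = far right (assumed linear pan)
        for s in range(samples_per_col):
            t = (c * samples_per_col + s) / sample_rate
            # Brightness -> loudness: each pixel scales the amplitude of its row's tone.
            v = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows)) / n_rows
            out.append((v * (1 - pan), v * pan))
    return out

# Hypothetical 2 x 2 image with a single bright pixel at the top left.
soundscape = image_to_soundscape([[1.0, 0.0], [0.0, 0.0]])
```

In this toy example the bright pixel produces a high-pitched tone in the left channel during the first half of the scan, followed by silence.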
The study included a total of 20 subjects, 9 normally sighted individuals (sighted controls [SCs]), and 11 blind individuals. Our group of legally blind subjects was relatively homogeneous in terms of blindness in that all of them were congenitally blind (CB) and 9 of the blind subjects did not have any form of light perception. The remaining 2 had faint light perception but they were unable to localize light or recognize any shape or form. The age range of the subjects was wide, from 18 to 60, all had normal hearing, and had no neurological or psychiatric conditions. For a full description of the subjects, causes of blindness, etc. see Supplementary Table 1. Sighted subjects had normal vision (corrective lenses permitted) and hearing. The Tel-Aviv Sourasky Medical Center Ethics Committee approved the experimental procedure, and written informed consent was obtained from each subject.
Training Procedures and Performance
All subjects had their first training session, which lasted between 1 and 1.5 h, on sensory substitution using the vOICe software immediately before the functional magnetic resonance imaging (fMRI) session reported here (i.e., the subjects were completely naïve to the principles of the vOICe before the training session). During the session, subjects were first taught the visual-to-auditory transformation rules and proceeded to practice the very simple shape and location perception of a standardized set of stimuli which is part of the training set of stimuli used in our laboratory to teach CB individuals to use the vOICe (including small lines, rectangles, and round objects presented at 4 possible locations on the screen; see Fig. 1B). Feedback on performance was given by showing the participants the sensory image that they had heard following each training trial, using vision for the sighted subjects, and haptic dimensional models of all the stimuli for the CB individuals. Critically, none of the stimuli delivered during training was repeated during the scan, in which they were introduced to completely new stimuli. Testing fully CB participants without any visual experience and using SSD enabled us to test the dependence of visual stream segregation on visual experience directly. Furthermore, given that such short training probably does not enable any long-term or extensive learning-induced plasticity (Pascual-Leone et al. 2005), this design also enabled us to isolate and research the baseline state of the visual system in CB in relation to form and location processing. In contrast to most previous studies using SSDs, we also collected the behavioral results from inside the scanner in order to ensure that subjects were deeply engaged in the shape and localization tasks (even if performance was not very high due to the brief training, prompted by our interest in the baseline—innate—state of the visual cortex prior to extensive training which may influence it). 
Performance was comparable (2-way analysis of variance [ANOVA], group effect, F = 0.65, P < 0.56) in the blind and sighted groups, which was a critical comparison in order to determine whether the blind also recruit shape and location centers and to allow comparison between groups. However, performance differed, on average, between the tasks, as the shape task was more difficult than the location task for both groups (2-way ANOVA, F = 297, P < 0.05, 47.5 ± 4.2% and 44.4 ± 4.4% for shape in the CB and SC, 83.5 ± 5.8% and 82.2 ± 4.3% for location in the CB and SC). No significant task (F = 0.62, P < 0.57), group (F = 18.1, P < 0.15), or interaction (F = 0.05, P < 0.82) effects were observed in a 2-way ANOVA for reaction time (10 ± 0.7 s and 10.1 ± 0.8 s for shape in the CB and SC, 9.9 ± 0.5 s and 10.2 ± 0.8 s for location in the CB and SC). It is worth noting that the activation pattern on the shape task in this study replicates to a large extent the pattern seen following 40 h of training in sighted subjects (Amedi et al. 2007), suggesting that subjects focused their attention on extracting shape.
Moreover, in order to address the potential influence of behavioral differences, we analyzed various subgroups that had both: 1) above chance performance in both tasks and 2) no significant difference between the 2 tasks. All the data from these subgroups were subjected to several converging analyses, including random-effect (RFX, see details below; Friston et al. 1999) analyses. The subgroups included a group of 5 briefly-trained participants with matched performance at the individual level (no more than 10% accuracy difference between the tasks; see details below), a group of 12 participants which included better-trained participants, and, critically for our main research question, a group of 7 CB participants subjected to an RFX analysis. First, we examined in more detail a subgroup of 5 briefly-trained subjects, 3 of whom were fully CB, who performed similarly on the 2 tasks (at an individual subject level) and exhibited higher than average performance on the shape task (Student’s paired t-test t4 = 1.132, P < 0.29, 62 ± 5.4% and 72 ± 8.2% for shape and location, respectively). All these individual subjects (Supplementary Fig. 3), as well as the group analysis of this subgroup (Fig. 3A), showed similar effects to the ones reported for the main groups, suggesting that the task-specific activation did not result from general difficulty biases. To further control for the effect of performance in a larger group of subjects, we rescanned as many subjects as possible from our original cohort (an average of 11 months after the original scan) after they had been trained for 40 additional hours on various tasks and visual stimuli using the vOICe SSD, enabling them to achieve better task performance.
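The subgroup selection criteria can be sketched as follows. This is a hedged illustration, not the analysis code used in the study: the function names, the 50% chance level (assumed from the two-button yes/no response), the reading of "10%" as percentage points, and all accuracy values are hypothetical.

```python
import math
from statistics import mean, stdev

def matched_subgroup(accuracy, chance=50, max_gap=10):
    """Return subjects above chance in both tasks whose two accuracies are close.

    accuracy: dict of subject id -> (shape %, location %).
    chance and max_gap are in percentage points (assumed values).
    """
    return sorted(
        subject for subject, (shape, location) in accuracy.items()
        if shape > chance and location > chance and abs(shape - location) <= max_gap
    )

def paired_t(xs, ys):
    """Paired t statistic and its degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical accuracies (shape %, location %), not data from the study.
accuracy = {"s1": (65, 70), "s2": (45, 85), "s3": (60, 75), "s4": (55, 60)}
selected = matched_subgroup(accuracy)
t_value, df = paired_t([60, 70, 65], [62, 75, 66])
```

With 5 subjects, as in the briefly-trained subgroup, the paired test has 4 degrees of freedom, which matches the reported t4 statistic.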
We thus inspected a mixed group of 12 participants (7 blind, 5 sighted) at both training levels, in which performance was matched between the tasks (2-way ANOVA, no effect of task—F = 3.5, P < 0.31, group—F = 1.5, P < 0.44 or interaction—F = 1.16, P < 0.29) and showed that they also manifested the double task dissociation between the visual streams, at the group level (using RFX analysis) (Fig. 3B), in a group of blind participants only (Fig. 4; n = 7, no behavioral task effect, F = 3.75, P < 0.07), and in all the individual subjects (Supplementary Fig. 3), regardless of their training.
General Experiment Design
Twenty novel simple visual stimuli were created using 2 different shape categories: round and angular shapes (e.g., a circle and a square; see Fig. 1B) and 2 different locations (left and right). The stimuli also varied in their vertical location (up and down), but this factor was irrelevant to the tasks required, enabling generalization of shape and location to various locations within the “visual” field. The use of novel stimuli is very demanding but also further enabled us to inspect the neural correlates of the online computation of discerning shapes and locations, as opposed to possible memory effects (in themselves activating the occipital cortex of the blind; Röder et al. 2001; Amedi et al. 2003). During each trial, subjects were presented with an auditory instruction: either “shape” or “location,” which directed their attention to the task. They were then presented with a 1-s soundscape (SSD sound rendering of the visual stimulus) that was repeated 4 times (total presentation time—4 s), given 5 additional seconds to reconstruct the image in their mind and were then instructed, by an auditory cue, to respond using a response box. Subjects used a two-button response box to indicate the parameter specified (Is the shape round? Is it in the left of the picture?). Each stimulus was presented twice: once for the shape task and once for the location task (without feedback), in a pseudorandomized order, such that half the stimuli were first presented in the shape task and half in the location task. Therefore, the location condition and shape condition each repeated a total of 20 times, once for each of the different stimuli. Half of the stimuli (and thus also the trials) were round; similarly, half of the stimuli were on the right side of the visual field. Subjects were not allowed to see or touch the pictures that generated the vOICe stimuli used in the fMRI testing, and none of the stimuli presented during training were used for the scan. 
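A counterbalanced trial list of this kind could be generated along the following lines. This is only a sketch under stated assumptions: the function name, the seed, and the two-pass structure (every stimulus appears once before any stimulus repeats) are illustrative choices, and the text does not specify the actual pseudorandomization procedure.

```python
import random

def build_schedule(n_stimuli=20, seed=0):
    """Return a list of (stimulus index, task) trials.

    Each stimulus appears once per task, and half of the stimuli are
    presented first in the shape task, half first in the location task.
    """
    rng = random.Random(seed)
    stimuli = list(range(n_stimuli))
    rng.shuffle(stimuli)
    shape_first = set(stimuli[: n_stimuli // 2])
    # First pass: each stimulus with the task it is assigned to see first.
    first_pass = [(s, "shape" if s in shape_first else "location")
                  for s in range(n_stimuli)]
    # Second pass: the same stimuli with the complementary task.
    second_pass = [(s, "location" if s in shape_first else "shape")
                   for s in range(n_stimuli)]
    rng.shuffle(first_pass)
    rng.shuffle(second_pass)
    return first_pass + second_pass

schedule = build_schedule()
```

The result contains 40 trials, 20 per task, consistent with each condition being repeated 20 times.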
Sighted subjects wore blindfolds and had their eyes shut for the duration of the scan to control for the lack of visual information during the scan between the groups.
fMRI Recording Parameters
The blood oxygen level–dependent (BOLD) fMRI measurements were performed in a whole-body 3-T GE scanner. The pulse sequence used was the gradient-echo echo planar imaging sequence. We used 29 slices of 4 mm in thickness. The data in-plane matrix size was 64 × 64, field of view (FOV) 20 cm × 20 cm, time to repetition (TR) = 1500 ms, flip angle = 70°, and time to echo (TE) = 30 ms. Each experiment had 320 data points with 2 repetitions (runs), whose order of presentation was controlled for across individual subjects. The first 5 images (during the first baseline rest condition) were excluded from the analysis because of non-steady state magnetization. Separate 3D recordings were used for coregistration and surface reconstruction. High-resolution 3D anatomical volumes were collected using T1-weighted images using a 3D turbo field echo T1-weighted sequence (equivalent to magnetization prepared rapid gradient echo). Typical parameters were: FOV 23 cm (RL) × 23 cm (VD) × 17 cm (AP); fold over—axis: RL, data matrix: 160 × 160 × 144 zero filled to 256 in all directions (approximately 1-mm isovoxel native data), TR/TE = 9/6 ms, flip angle = 8°.
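As a quick consistency check, the functional acquisition parameters imply the following derived quantities. The sketch assumes contiguous slices (no inter-slice gap) and reads the 320 data points as the number of volumes per run; neither is stated explicitly in the text.

```python
# Functional acquisition parameters as reported in the text.
fov_mm = 200.0            # 20 cm x 20 cm in-plane field of view
matrix_size = 64          # 64 x 64 in-plane matrix
n_slices = 29
slice_thickness_mm = 4.0
tr_s = 1.5                # TR = 1500 ms
n_volumes = 320           # data points, assumed to be per run
n_discarded = 5           # initial non-steady-state volumes

in_plane_voxel_mm = fov_mm / matrix_size            # in-plane voxel edge
slab_coverage_mm = n_slices * slice_thickness_mm    # assumes no inter-slice gap
run_duration_s = n_volumes * tr_s                   # total scan time per run
discarded_s = n_discarded * tr_s                    # time covered by dropped volumes
analyzed_volumes = n_volumes - n_discarded          # volumes entering the analysis
```

Under these assumptions the in-plane voxel edge is 3.125 mm, the slab covers 116 mm, and each run lasts 480 s, of which the first 7.5 s are discarded.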
Data analysis was performed using the Brain Voyager QX 1.10 software package (Brain Innovation, Maastricht, Netherlands) using standard preprocessing procedures. fMRI data preprocessing included head motion correction, slice scan time correction, and high-pass filtering (cutoff frequency: 3 cycles/scan) using temporal smoothing in the frequency domain to remove drifts and to improve the signal to noise ratio. No data included in the study exceeded motion of 2 mm in any given axis or had spike-like motion of more than 1 mm in any direction. Functional and anatomical data sets for each subject were aligned and fit to standardized Talairach space (Talairach and Tournoux 1988). Single-subject data were spatially smoothed with a minimal 3 dimensional 6-mm half-width Gaussian (2 functional voxels) in order to reduce intersubject anatomical variability and then grouped using a general linear model (GLM) in a hierarchical random effects analysis (RFX; Friston et al. 1999; see for instance implementation in Amedi et al. 2007). In addition to the main contrast, all GLM contrasts reported in this study also included a conjunction (or a mask) of the comparison of the main condition to baseline, to verify that only positive BOLD for the main predictor would be included in the analysis (e.g., in a contrast of location vs. shape, location was also contrasted with baseline, and the 2 contrasts were analyzed in conjunction, thus only voxels that showed significant RFX positive BOLD to location and also significantly higher activation to location vs. shape were highlighted in the maps). This also precluded misleading comparisons including the default mode network (DMN; Raichle et al. 2001; Raichle and Mintun 2006; Raichle and Snyder 2007) in areas showing deactivation to one condition and a larger deactivation to another (e.g., to preclude an area showing for instance deactivation to location and significantly more deactivation to shape from appearing on the map, to further demonstrate the dissociation of our findings from the DMN, see Supplementary Fig. 1).
In order to directly compare the effects of blindness and task across the entire data set, a 2-way ANOVA was computed (Fig. 2), taking into account all the sighted subjects (n = 9), and, in order to control for group size and complete blindness, the 9 fully CB who did not have any form of light perception (although the remaining 2 CB had merely faint light perception and no ability to recognize visual shapes). Post hoc contrasts were further computed from the ANOVA design and presented within the statistically significant main effect statistical parametric maps. Similar ANOVA analyses were also computed for the performance-matched subgroups (Figs 3 and 4). The minimum significance level of all results presented in the study was set to P < 0.05 taking into account the probability of a false detection for any given cluster (Forman et al. 1995), thus correcting for multiple comparisons. This was done based on the Forman et al. (1995) Monte Carlo simulation approach, extended to 3D data sets using the threshold size plug-in Brain Voyager QX. We also conducted a complementary independent regions of interest (ROIs) analysis. ROIs (Fig. 1E) were derived from the occipital peaks for the shape versus location and location versus shape contrasts (in conjunction with positive activation for the main condition, i.e., shape and location accordingly) in the first run of the experiment on the entire (n = 20) group.
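The conjunction (masking) logic applied to the GLM contrasts can be illustrated with a simplified voxel-wise sketch. The function name, the t values, and the single critical threshold are hypothetical; the study itself used cluster-size-corrected thresholds within Brain Voyager QX rather than a plain per-voxel cutoff.

```python
def conjunction_mask(t_main_vs_baseline, t_main_vs_other, t_crit):
    """Keep a voxel only if the main condition exceeds both baseline and the other task."""
    return [
        base > t_crit and other > t_crit
        for base, other in zip(t_main_vs_baseline, t_main_vs_other)
    ]

# Hypothetical t values for three voxels in a location > shape contrast.
t_location_vs_baseline = [4.2, -2.0, 3.8]
t_location_vs_shape = [3.5, 3.1, 0.4]
mask = conjunction_mask(t_location_vs_baseline, t_location_vs_shape, t_crit=2.3)
```

The second voxel shows why the mask matters: it is deactivated for location and more strongly deactivated for shape, so it passes the location versus shape contrast but is excluded by the conjunction with positive activation.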
Additionally, we sampled the peaks of activation in the ANOVA interaction statistical parametric map (sampled from the interaction effect on the first run of the experiment for both groups) in each of the groups (Fig. 2D). Activation peak beta and selectivity indices (contrast T values, the T value for the shape task minus that of the location task) were sampled from these ROIs at the group level of activation on the second run (repetition) of the experiment, thus making the ROI definition and parameter sampling independent of each other. Separate 3D recordings were used for surface reconstruction. Anatomical cortical reconstruction procedures included the segmentation of the white matter using a grow-region function embedded in the Brain Voyager QX 1.9.10 software package (Brain Innovation). The Talairach normalized cortical surface was then inflated, and the obtained activation maps were superimposed onto it.
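The independence between ROI definition and parameter sampling can be sketched as follows. This is a simplification under stated assumptions: the function name and all values are hypothetical, and the ROI is reduced to a single peak voxel, whereas the actual ROIs were defined from group-level activation peaks in Talairach space.

```python
def independent_roi_selectivity(run1_contrast_t, run2_shape_t, run2_location_t):
    """Locate the peak on run 1, then compute the selectivity index on run 2.

    The selectivity index is the T value for the shape task minus that of
    the location task, as defined in the text.
    """
    peak = max(range(len(run1_contrast_t)), key=run1_contrast_t.__getitem__)
    return peak, run2_shape_t[peak] - run2_location_t[peak]

# Hypothetical values for three voxels, not data from the study.
run1_shape_vs_location = [0.5, 3.0, 1.0]   # used only to define the ROI
run2_shape = [1.0, 4.5, 2.0]               # used only to sample the ROI
run2_location = [1.5, 2.0, 2.5]
peak_index, selectivity = independent_roi_selectivity(
    run1_shape_vs_location, run2_shape, run2_location
)
```

Because the peak is chosen from run 1 and the values are read from run 2, noise that inflated the run 1 contrast does not bias the sampled selectivity.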
To test for the putative dorsal–ventral division of labor dissociation, we tested what happens when a group of CB and blindfolded sighted individuals are trained to extract shape and location information using a unique visual-to-auditory sensory substitution algorithm utilized for visual rehabilitation by embedding the visual information in sounds (soundscapes). We analyzed the data in our experiment using several complementary methods of analysis. First, we tested for task preference in each group independently using GLM RFX statistical parametric maps (Fig. 1C,D) and independent ROI analysis (Fig. 1E). Additionally, we conducted a 2-way ANOVA analysis that directly tested the TASK, GROUP, and interaction effects in the entire data set (Fig. 2). We also examined several subgroups of subjects with matched performance between the 2 tasks (Figs 3 and 4) including the critical group of fully CB individuals with controlled performance using RFX analysis (Fig. 4).
First, we examined task selectivity separately for the 2 participant groups (Fig. 1C,D, see map peaks coordinates in Supplementary Table 2). We found a clear differentiation between the dorsal and ventral pathways for the processing of location and shape of soundscapes, respectively. In addition to the network of multisensory areas (such as the intraparietal sulcus and inferior frontal sulcus; Amedi et al. 2005, 2007; Lacey and Campbell 2006; Naumer et al. 2008; Striem-Amit, Dakwar, et al. 2011), shape processing activated the ventral occipital inferior temporal sulcus (ITS) located in the midst of the ventral visual stream. In contrast, the localization task preferentially activated a network involving both auditory regions (such as the supramarginal gyrus in the inferior parietal lobe; Weeks et al. 1999) as well as the precuneus, corresponding to Brodmann area 7, a higher order part of the visual dorsal stream. Therefore, despite the identical perceptual auditory stimulation in the 2 tasks, we observed differential recruitment of the ventral versus dorsal stream for shape and location processing. Most critically, we found a division of labor for form and location in the visual system of CB (Fig. 1D). Thus, we find visual stream/task-specific selectivity in the CB in the same experiment and show that this division of labor is common to both groups (Fig. 1C,D).
One clear difference between the 2 groups in the magnitude and distribution of activation was found in the ventral visual stream. Whereas the SC group (Fig. 1C) showed a weaker soundscape activation limited to higher order object-related areas in the inferior temporal ventral cortex (significant in the ROI analysis, see Fig. 1E, but not significant enough to pass the strict multiple comparison correction applied across the entire volume of the brain), the CB group (Fig. 1D) showed robust and vast ventral visual cortex preference for shape conveyed by sounds, demonstrating task-selective cross-modal plasticity (also see below, a direct examination of this effect using the interaction between GROUP and TASK effects in ANOVA, Fig. 2). This activation was even found within the early ventral stream areas, which correspond to ventral retinotopic areas reaching as far as the calcarine sulcus (V1). Although activation was stronger in the precuneus of the CB in the location task, no additional activation was found in this group in early areas corresponding to dorsal retinotopic areas or V1.
To further investigate task preference of the ventral and dorsal visual cortex through an additional independent method, we defined ROIs from the peaks of selective activation of the first run of the experiment in the entire group (n = 20, combined SC and CB group) for shape and location processing (Talairach coordinates: ITS LH −43, −60, −18, Precuneus RH 11, −52, 51) and examined the activation generated in the second run of the experiment in each group separately in these ROIs (Fig. 1E). The beta values of both the ITS and precuneus regions (ventral and dorsal stream peaks, respectively) showed a highly significant difference (Fig. 1E; at least P < 0.0005 for all contrasts) between the 2 tasks in both groups. Interestingly, activations in both peaks were higher in the CB group (P < 0.05).
To critically and directly investigate the separate effects and interaction between the task preference of the ventral and dorsal visual cortex and the group (with and without visual experience), we computed a two-way ANOVA (see Supplementary Fig. 2), with a TASK factor (shape and location) and a GROUP factor (SC and CB). In this analysis, we included all 9 SC participants and the 9 fully CB subjects (the 2 other CB were completely blind with minimal light perception and thus were omitted to fully control for both group size and for absolute blindness; results are similar when including these 2 blind subjects, data not shown). Performance in both groups was comparable (no significant group effect, F = 0.02, P < 0.92). Post hoc contrasts were computed within the significant statistical parametric maps of the main effects analysis. The TASK effect showed, as seen in the GLM analysis, a significant effect in the visual cortex (see Supplementary Fig. 2A). The post hoc TASK contrasts (Fig. 2A; see map peaks for the ANOVA in Supplementary Table 3) replicated the stream segregation seen between shape preference in the ventral stream and location preference in the precuneus in the dorsal stream, which were independent of the GROUP effect. Furthermore, this analysis suggests there was an additional region within the more posterior dorsal stream (Fig. 2A), in the bilateral lateral-occipito-parietal cortex (in the middle temporal gyrus/sulcus) showing preference for the location task across the groups. The GROUP effect indicated a main effect of long-term blindness in the posterior occipital cortex (see Supplementary Fig. 2B), in that an increased involvement of the posterior occipital cortex for processing soundscapes was identified in the CB relative to the SC (Fig. 2B). 
This increase was accompanied by a decrease in the activation of auditory cortices, which was similar to previously reported decreases in auditory cortex activation in CB for auditory localization (Weeks et al. 2000). Furthermore, the interaction of the 2 main effects (TASK × GROUP; Supplementary Fig. 2) was significant, indicating different task selectivity between the 2 groups. Post hoc contrasts revealed that this effect stems from differential preference for shape between the groups (Fig. 2C, other contrasts showed no significant activation). The posterior ventral cortex showed greater preference for shape in CB, even at a highly conservative threshold (P < 0.001 corrected for multiple comparisons; see Fig. 2C), which stretched all the way to the primary visual cortex (calcarine sulcus) at a more permissive yet significant threshold of P < 0.05 (corrected). In fact, areas more posterior to the inferior temporal cortex, in the retinotopic ventral posterior occipital cortex (ventral Brodmann area 19), showed a robust shape selective activation in the CB, in contrast to a significant deactivation in both tasks in the sighted (P < 0.05 for both groups; see GLM-beta values in Fig. 2D). These findings support the increased involvement of the ventral posterior occipital cortex in soundscape shape processing in the CB group alone (similarly, compare Fig. 1C and D).
Even though a general performance bias would not easily explain this complex task-specific stream-specific activation pattern or the similarity of the shape preference network to previous findings in highly trained sighted subjects who exhibited high performance (Amedi et al. 2007), the performance did differ between the tasks in both groups in favor of the easier localization task. We thus controlled for any general task performance biases and additionally inspected activation in several ways. We inspected several subgroups of participants that had higher and controlled performance across the tasks at both the single-subject level and the group level (including RFX analysis) and in independent ROI analyses for the peaks of activation derived from the main group effects.
Specifically, we first inspected a subgroup of 5 subjects (3 fully CB and 2 SC) who showed similar behavioral performance in both tasks at the individual level (Student’s paired t-test, t4 = 1.132, P < 0.29 across the group, maximal difference of 10% performance in each subject), even after being very briefly trained. The results indicated a task-specific differentiation between visual streams in all single subjects in this subgroup (Supplementary Fig. 3; including 3 CB) and in the data pooled across them (Fig. 3A, fixed-effect ANOVA, map peaks are reported in Supplementary Table 4; for similar results using GLM analysis, see also Supplementary Fig. 4A), including a preference for the stream-matching task in the independent ROI analysis (Fig. 3A; P < 0.005), confirming that the ventral/dorsal division of labor could not stem from performance differences alone.
Moreover, to fully control for behavioral effects in an RFX analysis of a larger group, we further scanned participants who had trained for a longer period of time and achieved better and more matched performance between the 2 tasks (for details, see Materials and Methods). The whole group analysis (n = 12, random-effect ANOVA, Fig. 3B, see also map peaks in Supplementary Table 4; comparable GLM analysis in Supplementary Fig. 4B), the analysis of the blind group alone (n = 7, random-effect ANOVA; Fig. 4, map peaks in Supplementary Table 4; comparable GLM analysis in Supplementary Fig. 4C), the complementary independent ROI analyses (Figs 3B and 4), and the individual subject analysis (of all the participants; Supplementary Fig. 3) all confirmed that the stream dissociation does not result from performance differences. In all these types of independent analyses, for groups and individual subjects, we found a clear dissociation between the visual streams, regardless of task difficulty, including in RFX analyses in the CB. Therefore, the findings suggest that some aspects of large-scale dissociation between the ventral and dorsal streams are clearly independent of visual experience.
Our study shows a double dissociation between the distinct activation of areas anatomically consistent with part of what is known to be visual ventral and dorsal streams in response to shape and location tasks using soundscape stimuli derived from visual origin in the same experiment. This pattern was seen across several independent analyses, in both groups (Fig. 2A and Fig. 1C,D) and most importantly, in the CB group separately (Figs 1D and 4). All the results in both groups, and most critically in the CB group, remain the same in the matched performance subgroups and analyses (Figs 3 and 4, Supplementary Fig. 3). The main regions showing task-specific activation across groups were the inferior temporal cortex for shape in the ventral stream and the precuneus and middle temporal sulcus/gyrus (Figs 1 and 2A) for location in the dorsal stream (e.g., Martinkauppi et al. 2000; Sestieri et al. 2006). Furthermore, the CB group also showed additional extensive recruitment of the posterior ventral stream for the soundscape shape task in ventral Brodmann area 19 (Figs 1D and 2C,D, Supplementary Fig. 2). Most crucially, our findings suggest that despite life-long blindness, lack of visual experience and the use of novel stimuli with short training, a large extent of the ventral visual cortex in CB can be recruited to process visual-from-auditory shapes, while at least part of the dorsal stream processes visual-from-auditory location information. Therefore, life-long existence without vision does not render the 2 visual streams completely unresponsive to their classical division of labor.
The activation observed in the ventral stream is consistent with a previous study in sighted (as well as one late blind and one CB; Amedi et al. 2007) which showed that LOtv, a tactile-visual shape area, is activated for shape information conveyed using SSD soundscapes. However, that study, similar to other studies in sighted subjects (Renier et al. 2005a, 2005b; Poirier et al. 2006, 2007), could not entirely avoid the visual imagery confound, which may have contributed to any reported visual cortex activation. Moreover, most studies used a combination of only highly trained proficient SSD-users as participants (following as many as 40 h of training; Amedi et al. 2007) and familiar, well-practiced stimuli (Renier et al. 2005a, 2005b; Amedi et al. 2007; Ptito et al. 2009; Matteau et al. 2010; although sometimes as part of a training paradigm, Arno et al. 2001; Ptito et al. 2005; Kim and Zatorre 2011). All these might complicate the interpretation and strength of previous results. For instance, the brains of proficient users may have already undergone significant plastic changes due to the extensive use of SSDs (see, e.g., increased occipital cortex activation following SSD training; Ptito et al. 2005), and the use of familiar stimuli could generate activation due to memory rather than shape processing in the occipital cortex of the blind subjects (Röder et al. 2001; Amedi et al. 2003). More critically, since most of these studies focused only on one individual task (and did not contrast, e.g., shape and motion or shape and location), they were unable to directly test the double-dissociation division of labor of the visual cortex.
By circumventing these possible confounds, our study is the first to show the segregation between the ventral and dorsal streams in CB using an SSD in the same subjects, the same experimental setup and using novel stimuli. Thus, we are able to demonstrate the selective activation of visual areas by auditory stimuli in the absence of any experience that could support visual imagery. Interestingly, the activation and even task selectivity of the visual streams were more robust in the blind group than in the sighted group (Fig. 1C–E and Fig. 2B,D, Supplementary Fig. 2), particularly in the posterior occipital cortex (Figs 1 and 2B, Supplementary Fig. 2). While previous studies have reported increased activation in the occipital cortex of the blind (as compared with sighted) for various nonvisual tasks (Ptito et al. 2005, 2009; Matteau et al. 2010; Renier et al. 2010), in our study we found a more complex interaction between plasticity in the blind and specific task preference in the posterior ventral occipital cortex (Fig. 2C,D). In the anterior ventral ITS, we found activation in both groups (though significantly stronger for the blind), while in the posterior ventral stream in retinotopic areas (BA 19; Fig. 2) we observed robust activation and preference for shape in the blind and significant deactivation in the sighted (Fig. 2D). This stream- and task-specific cross-modal plasticity effect shows that not only is the ventral stream still selective for shape, but this preference is also enhanced (as compared with the sighted) when the shape is encoded through sound. Both results argue against a visual imagery explanation as the main basis for the ventral activation for shapes of auditory inputs which represent visual entities, suggesting instead that cross-modal plasticity biases (Pascual-Leone and Hamilton 2001; Pascual-Leone et al. 2005) are a stronger factor in driving the visual ventral stream.
These findings have important theoretical implications, as they contribute further evidence supporting recent theories of brain organization which argue that the selectivity of the different functional cortical regions is not according to their input modality but rather according to task selectivity, which may be computed with various modalities (Amedi et al. 2001, 2007; Pascual-Leone and Hamilton 2001; Mahon and Caramazza 2009; Reich et al. 2011). The current results are consistent with several other studies demonstrating multisensory or task-dependent metamodal processing of specific brain areas within the visual system (e.g., LOtv; Amedi et al. 2002, 2007; James et al. 2002). Such studies showed recently that spatial processing of both simple auditory chords and vibrotactile stimulation selectively activates the middle occipital gyrus (MOG) of the blind (Renier et al. 2010; Collignon et al. 2011), suggesting a task-specific role for an additional unique area of the visual system. Our data are in line with this finding, as in addition to the robust task selectivity of the precuneus (which also showed multisensory properties in the sighted; Renier et al. 2009), we observed spatial selectivity in sensory substitution artificial vision input in an area in close proximity to the MOG, the posterior middle temporal sulcus/gyrus (Fig. 2A). Similarly, specificity for tool stimuli over other, nonmanipulable objects was observed in 2 regions of the parietal cortex of the blind (Mahon et al. 2010), activation for Braille reading was found in the “visual word form area” (Reich et al. 2011), activation for kinesthetically guided hand movements was found in primary somatosensory cortex independent of the visual experience of participants (Fiehler et al. 2009), and activation of the human MT region was found for nonvisual motion in the blind (Poirier et al. 2006; Beauchamp et al. 2007; Ricciardi et al. 2007; Ptito et al. 2009; Matteau et al. 2010; Sani et al. 2010).
All these suggest that the brain might be composed of flexible task-selective but modality-independent operators (Reich et al. 2011). Another recent study that is particularly relevant to the conclusions drawn here looked beyond area-specific computation and showed that the larger scale animate/inanimate organization within the high-order anterior ventral visual cortex is independent of vision (in a group of sighted and 3 CB individuals; Mahon et al. 2009). Mahon et al. (2009) concluded from their results that modality dependence is secondary as a hierarchical organizational factor to the object domain (e.g., living vs. non-living, also see a review of conceptual object categories; Mahon and Caramazza 2009) in the ventral visual cortex. Our findings extend such concepts of a-modal innately determined developmental constraints to the more fundamental organizational principle of the segregation between the 2 processing streams. In doing so, they extend the findings beyond visual object conceptual categories to postulating that the whole brain may be task specific but sensory modality independent, if the relevant computation and task can be achieved from the sensory input (even if this is not an ecological way to do so, i.e., via SSD).
In this respect, sensory substitution is an ideal tool to study task-dependent operations as it teases apart the effect of the modality from the computation or task in question and also makes it possible to study tasks using untrained, novel, “modalities” and stimuli. The functional recruitment in the brain of CB following such a short training period makes it highly improbable that it reflects any extensive plastic changes (Pascual-Leone et al. 2005). Instead, it suggests that the division of labor between the ventral and dorsal streams for form and location in the visual cortex must already be present and the short training presumably “revealed” these innate preferences under our special experimental conditions. The life-long use of these areas for visual input more than for information originating in other senses (along with the usefulness of vision to decipher shape) makes the streams appear as though they are only or mostly visual. This study suggests this is not the case and that cross-modal plasticity can still result in their activation for their original visual tasks.
What are the developmental endogenous, or innate, constraints that might contribute to such a sensory-independent task-selective organization in the CB? We speculate that 2 factors, which are not mutually exclusive, could have taken part. The first is the presence of intrinsic modality-independent preferences for a particular (different) type of content or computation in each brain area (in our case, in the dorsal and ventral regions). For example, an area might specialize in computing motion because it computes subtractions of a motion coincidence detector regardless of sensory input. If this is true, then all these areas were always multisensory, possibly with visual dominance since it is perhaps the most reliable sensory input in the sighted. Alternatively, the task specificity might stem from the different connectivity pattern of each area. In our case, the visual streams may differ in their connectivity pattern to other cortical areas, which together drive their task-selective organization (e.g., via top-down modulation). For example, it has been suggested that premotor–posteromedial parietal connections are likely to subserve abstract cognitive processes involving visuospatial information in the precuneus (Cavanna and Trimble 2006), while feedback connectivity from frontal and somatosensory cortices to the ventral (inferior temporal) occipital cortex may underlie its multisensory function for object recognition (Amedi et al. 2001, 2003; Deshpande et al. 2008). In addition to the preexisting connectivity, connectivity between the visual cortex and other sensory cortices may also be strengthened by sensory deprivation (e.g., between A1 and V1; Klinge et al. 2010). Thus, although the input in our case was auditory rather than visual, the preserved functional connectivity of each stream still dictates development toward processing shape or location, which may even be strengthened for the nonvisual modalities.
This type of top-down modulation based on the existing connectivity pattern might also originate from the corresponding auditory streams. Similar to the visual streams, auditory processing is also divided into what and where pathways, whose functional and anatomical segregation has been thoroughly validated in many species, including humans (Pandya and Vignolo 1969; Romanski et al. 1999; Kaas and Hackett 2000; Rauschecker and Tian 2000; Alain et al. 2001; Kubovy and Van Valkenburg 2001; Kraus and Nicol 2005; Lomber and Malhotra 2008; van der Zwaag et al. 2011). Some selective activation of these auditory streams is also seen in our contrasts in addition to the visual cortical streams. While our results do not clearly show the auditory division of labor between the rostral and caudal parts of the early-stage auditory areas on the supratemporal plane (which can better be depicted by ultra high-field 7-T scanners due to the relatively small size of the auditory areas and the integrated what and where processing of some of these regions; Griffiths and Warren 2002; van der Zwaag et al. 2011), we do, however, find evidence for the stream differentiation in the inferior parietal lobe (supramarginal gyrus and even post STG) and frontal lobe (between the inferior and superior what and where regions), in accordance with the auditory stream division. Therefore, it may be speculated that the same auditory connectivity may partially underlie the visual cortex differentiation and selectivities found in the current report. However, previous studies of purely auditory localization and object identification in the sighted or blind (as opposed to visual-to-auditory processing using an SSD) have not shown clear and consistent activation of the ITS (Amedi et al. 2002, 2007) or precuneus (Collignon et al. 2009, although the precuneus may sometimes be activated by nonvisual localization in sighted; Renier et al. 2009). 
An exception to this may be the MOG, which shows selective activation for auditory localization in the blind (Collignon et al. 2007; Renier et al. 2010). This suggests that the main occipital cortical regions shown here (ITS and precuneus) are less likely to partake in the auditory processing streams per se and more likely to be involved in computations that resemble vision (e.g., object shape). Thus, their roles are less likely to develop as regular parts of the auditory streams.
Interestingly, one difference between the shape and location activation was the lack of more posterior recruitment of the dorsal stream even in the CB (Figs 1 and 2B–D). One possible explanation is that the location task was simply easier (possibly resulting in less activation for this task). However, the replication of the group results in our subgroup of 5 subjects, in the larger mixed group (n = 12) and critically, in the large group of the CB (n = 7, RFX analysis), who had similar performance on both tasks, as well as in all the individual subjects comprising these groups (Figs 3 and 4, Supplementary Fig. 3), rules out this explanation. Alternatively, these differences might be due to different developmental timelines of the 2 streams. Although both streams may be not only visual but task-specific and sensory-input independent to some extent, the dorsal stream matures earlier in development (Lewis and Maurer 2005), whereas it has been shown that the ventral stream may continue to develop until adolescence (Golarai et al. 2007). This suggests that the ventral stream may continue to be plastic later in life (e.g., have longer sensitive periods) relative to dorsal stream regions and thus be more likely to reorganize differently and adaptively (e.g., as seen in our study, to sounds). Supporting this notion, previous studies have shown robust changes of the ventral visual cortex to other nonvisual functions such as language and memory (Sadato et al. 1996; Cohen et al. 1997; Röder et al. 2001, 2002; Burton, Snyder, Conturo, et al. 2002; Burton, Snyder, Diamond, et al. 2002; Amedi et al. 2003; Pascual-Leone et al. 2005; Noppeney 2007; Bedny et al. 2011). Moreover, studies of sensory restoration after long-term visual deprivation (Fine et al. 2003; Gregory 2003; Ostrovsky et al. 2006, 2009) suggest that ventral stream visual functions may remain deficient after visual peripheral recovery even after months of training following the procedure (perhaps due to the aforementioned robust plastic cross-modal changes), whereas motion perception (a dorsal stream function, processed in the early-maturing MT; Lewis and Maurer 2005) appears to recover almost immediately following sensory restoration.
Does this condemn the critical ventral stream functions to remaining largely deficient following sight restoration? While previous studies (Fine et al. 2003; Gregory 2003; Ostrovsky et al. 2009) indeed show very serious deficits in object shape recognition and segregation from background that might hinder sight restoration efforts regardless of the exact clinical/technological approach, our results imply that the ventral stream could hypothetically be shifted back toward its original task preference. The preferential activation of the posterior ventral stream for shape in the CB (Figs 1 and 2, Supplementary Fig. 2) suggests that while visual deprivation may modify the role of these regions, the general stream preference remains and can be revealed after learning to extract the relevant information from other modalities (in our case audition). The adaptation of the blind to processing auditory information more than the sighted may even result in quicker recruitment of cross-modal visual shape processing transmitted by SSD. Although our subjects were not studied under a clinical sight restoration protocol (in the more conventional sense of restoration of visual qualia using, e.g., retinal prostheses; Dowling 2008), we feel it is important to discuss our results in the context of clinical settings and to speculate on their putative importance. For example, future work should examine whether this unique combination of sensory-input independent organization, baseline biases to shape and location, and longitudinal long-term plasticity enables the use of SSDs as neuro-rehabilitative aids to train the visual cortex to analyze visual information. Therefore, SSDs can theoretically support visual rehabilitation both before such procedures, for example, to help reprogram or awaken the hypothesized visual streams to selectively process “vision” (rather than language and memory; Sadato et al. 1996; Cohen et al. 1997; Röder et al. 2001, 2002; Burton, Snyder, Conturo, et al. 2002; Burton, Snyder, Diamond, et al. 2002; Amedi et al. 2003; Pascual-Leone et al. 2005; Noppeney 2007; Bedny et al. 2011; Striem-Amit, Bubic, et al. 2011) and then by serving as a “sensory interpreter,” providing explanatory input to the novel visual signal arriving from an alien invasive device when it is first introduced to the visually restored individual. Interestingly, in a recent study, the more intact and faster recovering dorsal stream functions (e.g., detecting moving stimuli) were successfully used to train the deficient ventral stream functions (visual parsing and object recognition) in blind individuals who regained sight through medical intervention using shape-from-motion training (Ostrovsky et al. 2009). SSDs could thus be used in a similar way to train the visual cortex via other modalities.
To conclude, this study shows that visual experience is not necessary in order for the dorsal–ventral division of labor within the visual system to emerge, at least to some extent. This suggests the operation of innately determined constraints on the emergence of the most important large-scale organization of the visual cortex. Our results favor the view that these preferences are determined, in part, by dimensions of domains of knowledge or task similarity that cannot be reduced to the visual experience of individuals (e.g., Mahon and Caramazza 2009; Renier et al. 2010, 2011). Finally, our results support the notion that large parts of the visual system are task specific and modality invariant in nature and can be accessed, via cross-modal mechanisms, by any sensory modality.
International Human Frontiers Science Program Organization Career Development Award (CDA-0015/2008-070509 to A.A.); EU-FP7 MC International Reintegration grant (MIRG-CT-2007-205357-250208 to A.A.); James S. McDonnell Foundation scholar award (220020284 to A.A.); Israel Science Foundation (ISF 1684/08); The Sieratzki family award (to A.A.); Vision Center grant from the Edmond and Lily Safra Center for Brain Sciences (to A.A.).
We thank D.R. Chebat and A. Bubic for their in-depth review of the final draft of the paper and other very useful discussions. We would also like to thank the Hebrew University Hoffman Leadership and Responsibility Fellowship Program support (to E.S.A.) and the Samuel and Lottie Rudin Foundation support (to L.R.). Conflict of Interest: None declared.