Many empiricist theories hold that concepts are composed of sensory–motor primitives. For example, the meaning of the word “run” is in part a visual image of running. If action concepts are partly visual, then the concepts of congenitally blind individuals should be altered in that they lack these visual features. We compared semantic judgments and neural activity during action verb comprehension in congenitally blind and sighted individuals. Participants made similarity judgments about pairs of nouns and verbs that varied in the visual motion they conveyed. Blind adults showed the same pattern of similarity judgments as sighted adults. We identified the left middle temporal gyrus (lMTG) brain region that putatively stores visual–motion features relevant to action verbs. The functional profile and location of this region were identical in sighted and congenitally blind individuals. Furthermore, the lMTG was more active for all verbs than nouns, irrespective of visual–motion features. We conclude that the lMTG contains abstract representations of verb meanings rather than visual–motion images. Our data suggest that conceptual brain regions are not altered by the sensory modality of learning.
The 18th century British Empiricist, George Berkeley, proposed that all concepts are composed of sensory experiences. As a consequence, he believed that the concepts of congenitally blind individuals are fundamentally different from concepts of the sighted (Berkeley 1709/1732). In modern cognitive neuroscience and psychology, there is a spectrum of disparate views on the relationship of sensory experience and concepts. At one end of the spectrum, concept retrieval is viewed largely as the reactivation of sensory–motor experiences (e.g., Pulvermuller 1999; Barsalou et al. 2003; Gallese and Lakoff 2005). In some accounts, sensory–motor representations may be bound together by nonsensory brain regions, but the representational structures themselves are modality specific (Allport 1985; Barsalou et al. 2003; Damasio et al. 2004; Gallese and Lakoff 2005; Barsalou 2007). For example, the concept “run” is made up (in part) of a visual image of running stored in visual cortex. Like British Empiricism, these views predict that the concepts of congenitally blind individuals differ from those of the sighted in that they lack the visual component.
At the opposite end of the spectrum, concepts are viewed as modality independent. Concepts, on these accounts, are stored in nonperceptual brain regions and organized according to conceptual rather than perceptual dimensions (e.g., Potter and Faulconer 1975; Caramazza et al. 1990; Rogers et al. 2004; Bedny et al. 2008). These accounts predict that conceptual representations of congenitally blind adults should be similar to those of the sighted. Although congenitally blind individuals have never seen, and their visual regions are profoundly reorganized (e.g., Amedi et al. 2004), their conceptual representations should be relatively unchanged.
We tested these views by studying lexicalized action concepts—specifically the meanings of action verbs. Many empiricist views hold that action verb meanings include visual–motion features (Tranel et al. 2003; Meteyard et al. 2007; Meteyard et al. 2008; Revill et al. 2008). Apparently consistent with this prediction, comprehension of action verbs engages posterior aspects of the left middle temporal gyrus (lMTG) in the proximity of visual–motion regions (motion sensitive area (MT/MST) and the left homologue of the right superior temporal sulcus [rSTS]) (e.g., Martin et al. 1995; Kable et al. 2002; Davis et al. 2004; Tettamanti et al. 2005; Bedny et al. 2008). For example, Tettamanti et al. (2005) found greater activity for action sentences than abstract sentences near the lMTG (however, for a study that did not find the same effect, see Tomasino et al. 2007). This activity has been taken to reflect retrieval of visual–motion features during action verb comprehension (e.g., Martin et al. 1995; Martin and Chao 2001; Kable et al. 2002; McClelland and Rogers 2003; Tranel et al. 2003; Kemmerer et al. 2008; Noppeney 2008; Revill et al. 2008).
There is one striking piece of evidence that is inconsistent with the idea that the lMTG stores visual–motion features of actions: When deciding whether hand actions involve a tool, lMTG activity is high in individuals who have never seen (i.e., congenitally blind adults) (Noppeney et al. 2003). These data suggest that visual–motion experience is not necessary for the lMTG to be engaged in action concepts. Somewhat surprisingly though, the same study did not find increased lMTG activity for motion words that describe whole-body movements, in either sighted or congenitally blind individuals. There are a number of possible explanations of these data. The lMTG could represent visual–motion features of actions, but hand actions (or actions with tools) might be selectively preserved in congenitally blind adults due to their motor and tactile associations. Alternatively, lMTG activity could be preserved in congenitally blind adults because the lMTG does not represent visual–motion information in sighted or blind individuals (Bedny et al. 2008). Rather, the lMTG may represent abstract conceptual or grammatical features of action verbs.
In the present study, we measured blood oxygenation level–dependent (BOLD) signal in blind and sighted adults while they performed a semantic judgment task with action verbs as well as 2 other categories of verbs (mental and change of state verbs) and 3 categories of nouns that varied in visual–motion features (animals, artifacts, natural inanimate objects) (Bedny et al. 2008). Thus, we tested whether the lMTG represents visual–motion features of action verb meanings by testing 2 predictions: 1) if the lMTG stores “visual” features, this region should be absent or altered in congenitally blind adults and 2) if the lMTG stores “motion” features, it should respond more to words with high-motion associations (i.e., actions and animals) than those with low-motion associations (e.g., mental verbs and inanimate objects).
Materials and Methods
Twenty-one sighted adults (8 females, mean age 52 years, standard deviation [SD] 11) and 10 congenitally blind adults (6 females, mean age 49 years, SD 9) participated in this experiment. One sighted participant's data were excluded from analyses because he was unable to perform the task. Blind and sighted participants had the same average years of education (mean 17, SD 2) (see Supplementary Table 2). All blind participants reported having at most faint light perception from birth and had lost their vision due to pathology in or anterior to the optic chiasm. None of the participants suffered from neurological disorders or had ever sustained head injury. This study was approved by the institutional review board. All subjects gave informed consent and were compensated $30 an hour.
While undergoing functional magnetic resonance imaging (fMRI), participants heard pairs of words over headphones. Participants indicated how related in meaning the words were on a scale of 1–4 by pressing buttons on a response pad. Five word pairs from one condition made up a block. Blocks were 18 s long and were separated by 14 s of fixation. The experiment was broken up into 5 runs of 7.7 min each.
Participants heard 50 word pairs per category. Each word was presented twice during the experiment but paired with a different word for the second presentation. In a control condition, participants heard pairs of backwards speech sounds and performed an auditory similarity judgment task. Backwards speech sounds were created by digitally reversing the word stimuli, rendering them unintelligible.
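The reported block and run durations can be cross-checked with simple arithmetic. The sketch below is only a consistency check. It makes several assumptions that are not stated in the text: that the backwards-speech control contained the same number of pairs as each word category (giving 7 conditions in total), that blocks were divided evenly across the 5 runs, and that each run began and ended with a fixation period. The constant and function names are illustrative.

```python
# Design parameters reported in the text.
PAIRS_PER_CONDITION = 50
PAIRS_PER_BLOCK = 5
BLOCK_S = 18        # block duration in seconds
FIXATION_S = 14     # fixation between blocks in seconds
N_RUNS = 5

# Assumption (not stated in the text): 6 word categories plus a
# backwards-speech control with the same number of pairs.
N_CONDITIONS = 7


def run_length_seconds():
    """Length of one run, assuming fixation before the first and after the last block."""
    blocks_per_condition = PAIRS_PER_CONDITION // PAIRS_PER_BLOCK
    total_blocks = blocks_per_condition * N_CONDITIONS
    blocks_per_run = total_blocks // N_RUNS      # assumes an even split across runs
    fixations_per_run = blocks_per_run + 1       # assumes fixation at both ends
    return blocks_per_run * BLOCK_S + fixations_per_run * FIXATION_S


if __name__ == "__main__":
    seconds = run_length_seconds()
    print(f"{seconds} s = {seconds / 60:.1f} min per run")
```

Under these assumptions the computed run length matches the reported 7.7 min, which suggests (but does not establish) that the assumed layout is close to the one actually used.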
Word stimuli consisted of 50 words in each of the following categories: high-motion verbs (action); intermediate motion verbs (change of state and bodily function); low-motion verbs (mental); high-motion nouns (animals); intermediate motion nouns (tools); and low-motion nouns (inanimate natural). Visual–motion ratings were obtained from a separate group of sighted participants. Semantic categories and verbs and nouns were matched on familiarity, frequency as well as length in syllables and phonemes (Coltheart 1981). Due to a technical error, behavioral data were only recorded for 6 of the 10 blind subjects and 13 of the 20 sighted subjects (all analyses of variance [ANOVAs] n = 19). For further details on the procedure and stimuli, see Supplementary Material and Bedny et al. (2008).
Structural and functional data were collected on a 3-T Siemens scanner at the Martinos Imaging Center at the McGovern Institute for Brain Research at the Massachusetts Institute of Technology. For details of fMRI data acquisition, see Supplementary Material.
T1-weighted structural images were collected in 128 axial slices with 1.33 mm isotropic voxels (repetition time [TR] = 2 ms, echo time [TE] = 3.39 ms). Functional BOLD data were acquired in 3 × 3 × 4 mm voxels (TR = 2 s, TE = 30 ms), in 30 near axial slices. The first 4 s of each run were excluded to allow for steady state magnetization. Data analysis was performed using SPM2 (http://www.fil.ion.ucl.ac.uk/) and in-house software. The data were realigned, smoothed with a 5 mm smoothing kernel, and normalized to a standard template in Montreal Neurological Institute space.
BOLD signal differences between conditions were evaluated through second-level random-effects analysis. In whole-brain analyses, the general linear model was used to analyze BOLD activity of each subject as a function of condition. Covariates of interest were convolved with a standard hemodynamic response function. Nuisance covariates included run effects, an intercept term, and global signal. Time series data were subjected to a high-pass filter (1 cycle/128 s). The false positive rate was controlled at α < 0.05 (corrected) by performing Monte Carlo permutation tests on the data (Nichols and Holmes 2002; Hayasaka and Nichols 2004). Region of Interest (ROI) analyses were performed on the average of percent signal change from TR 3 through 9 relative to a rest baseline. (The first 2 TRs were excluded to account for the hemodynamic lag; for examples of similar analyses, see Saxe et al. 2006; Baker et al. 2007.) Functional ROIs were identified in individual subjects based on orthogonal contrasts. For the purposes of defining ROIs, contrasts were thresholded in individual subjects at a voxelwise threshold of P < 0.001 uncorrected, with a cluster threshold of k ≥ 10 contiguous voxels. If no regions were observed at this threshold, the threshold was lowered to P < 0.01. If no regions were observed at the lowered threshold, the subject was excluded from that analysis.
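The two ROI steps described above (the threshold fallback used to define an individual ROI, and the averaging of percent signal change over TRs 3 through 9) can be illustrated with a minimal sketch. This is not the in-house software used for the study. The function names and the input format are hypothetical, and the sketch assumes that the k ≥ 10 cluster criterion also applied at the lowered threshold, which the text does not specify.

```python
def choose_roi_threshold(cluster_sizes_by_p, min_cluster=10):
    """Return the voxelwise P threshold at which an ROI is found, or None.

    cluster_sizes_by_p is a hypothetical input: a dict mapping a voxelwise
    threshold (0.001 or 0.01) to the sizes of the clusters found in one
    subject at that threshold.
    """
    for p in (0.001, 0.01):  # strict threshold first, then the lowered one
        sizes = cluster_sizes_by_p.get(p, [])
        if any(k >= min_cluster for k in sizes):
            return p
    return None  # no ROI at either threshold: subject excluded from this analysis


def percent_signal_change(block_signal, rest_baseline):
    """Convert raw signal at each TR of a block to percent change from rest."""
    return [100.0 * (s - rest_baseline) / rest_baseline for s in block_signal]


def mean_psc(block_signal, rest_baseline, first_tr=3, last_tr=9):
    """Average percent signal change over TRs first_tr..last_tr (1-indexed, inclusive)."""
    psc = percent_signal_change(block_signal, rest_baseline)
    window = psc[first_tr - 1:last_tr]  # skips the first 2 TRs (hemodynamic lag)
    return sum(window) / len(window)


if __name__ == "__main__":
    # Made-up numbers: one 18 s block sampled at TR = 2 s (9 TRs).
    block = [100, 100, 101, 102, 102, 102, 102, 102, 101]
    print(choose_roi_threshold({0.001: [4], 0.01: [12]}))
    print(mean_psc(block, rest_baseline=100))
```

With an 18 s block and a 2 s TR, the window from TR 3 through TR 9 covers the last 7 of the 9 volumes in a block.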
There were no condition, group, or interaction effects in average similarity ratings (2-by-6 ANOVA, all Ps ≥ 0.3). To assess whether blind and sighted individuals have the same intuitions about which words are similar in meaning, we correlated each group's ratings to the ratings of an independent group of young sighted subjects. We were specifically interested in whether the ratings of blind individuals are less similar to those of sighted people for categories that might include visual features in their meanings: concrete words (e.g., action verbs relative to thought verbs) or specifically concrete nouns (e.g., animal nouns relative to verbs). The correlation plots are presented in Supplementary Figure 1. The ratings of each individual blind participant were reliably correlated with those of the young sighted group for every category (Ps < 0.05). The ratings of the blind group and the older sighted group were reliably and equally correlated to the ratings of young sighted adults for every category (across categories, blind to young sighted r2 = 0.53, P < 0.0001; older sighted to young sighted r2 = 0.62, P < 0.0001; for r2 values for each category, see Supplementary Table 1). Moreover, the residuals of the correlations from the older sighted with the younger sighted and the early blind with the younger sighted were highly correlated (r2 = 0.53, P < 0.0001), indicating that the items that differed between the young sighted and the older sighted were the same as those that differed between the young sighted and the blind. Differences among groups therefore reflect age or cohort effects and not effects of blindness.
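The logic of the residual analysis can be made concrete with a short sketch. It regresses each group's item ratings on the young sighted ratings, keeps the residuals, and then correlates the two sets of residuals. If the same items deviate from the young sighted ratings in both groups, the residuals will be correlated. The function names and the ratings below are invented for illustration. They are not the study's data, and the original analysis may have been computed differently.

```python
def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def residuals(x, y):
    """Part of y not predicted by a linear fit on x."""
    slope, intercept = fit_line(x, y)
    return [b - (slope * a + intercept) for a, b in zip(x, y)]


if __name__ == "__main__":
    # Hypothetical mean similarity ratings for four word pairs.
    young_sighted = [1.0, 2.0, 3.0, 4.0]
    older_sighted = [1.5, 2.0, 3.5, 4.0]
    blind = [1.4, 2.1, 3.4, 4.1]

    print(r_squared(young_sighted, older_sighted))
    print(r_squared(young_sighted, blind))
    # Shared item-level deviations show up as correlated residuals.
    print(r_squared(residuals(young_sighted, older_sighted),
                    residuals(young_sighted, blind)))
```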
Sighted and blind participants were faster to respond to noun pairs than verb pairs (F1,14 = 28, P< 0.0001). The group-by-condition interaction was not reliable (F1,14 = 2.89, P = 0.11). Overall the groups did not differ from each other in reaction time (F1,14 = 0.03, P = 0.87). Reaction times for verb categories and noun categories did not differ among themselves nor did word pairs differ from backwards speech (no effect of condition, group, or group-by-condition interaction Ps > 0.1). (Average similarity ratings and reaction times are summarized in Supplementary Table 1.)
Do Congenitally Blind and Sighted Adults Engage the lMTG When They Understand Action Verbs?
To determine whether blind adults had an lMTG region that responded to action words, we compared activity for action verbs (the highest motion verb category) to natural inanimate objects (the lowest motion noun category) using whole-brain analysis. In this contrast, sighted adults had greater activity in an lMTG region that was situated on the STS on the left and extended into the superior temporal gyrus (STG) (−56, −49, 6; −62, −50, 12; k = 51; t = 5.72). A similar focus of activation was found in the group of blind adults (−44, −58, −2; k = 49; t = 8.75). The lMTG region in the blind group extended from the left middle temporal gyrus into the inferior temporal gyrus (see Fig. 1A). There were no other regions that were more active for action verbs than for inanimate natural objects in either group. Critically, there were no regions that were more active for action verbs than inanimate objects in sighted but not blind adults (no group-by-condition interaction).
What Is the Functional Profile of the lMTG? Does the lMTG Distinguish between High- and Low-Motion Words or between Verbs and Nouns?
To determine whether the lMTG represents high-motion words, or alternatively whether this region represents verbs irrespective of visual–motion information, we performed whole-brain random-effect analyses testing first for a “visual–motion effect” and then a “verb/noun” effect.
To test for an effect of motion features, we compared high-motion nouns and high-motion verbs to low-motion nouns and low-motion verbs. This contrast did not reveal any significant voxels in either sighted or blind adults.
One possibility is that motion features are specifically important for action concepts (and not animals, the high-motion nouns). We therefore also compared action verbs (high-motion verbs) to mental verbs (low-motion verbs). No regions responded more to high-motion verbs than to low-motion verbs in either group at a corrected threshold of P < 0.05. No voxels were found in the lMTG or the surrounding left lateral temporal lobe in either group, even when the threshold was lowered to P < 0.001, k = 10, uncorrected. (This uncorrected threshold does, however, reveal activation elsewhere; see below.)
Next we tested the hypothesis that the lMTG represents verbs (verb/noun effect). We compared the lowest motion verb category (thought verbs) to the highest motion noun category (animal nouns). Animal nouns have higher visual–motion ratings than thought verbs. Nevertheless, in sighted adults, there was greater activity in the lMTG (Brodmann area [BA] 22) for thought verbs than for animal nouns. This activity extended from the lMTG into the STG (−64, −44, 20; −52, −42, 2; −64, −42, 4). In this contrast, we also observed activity in the homologous region on the right (60, −38, 8; 48, −36, 4) as well as in the left inferior frontal gyrus (LIFG) (−56, 18, −2) extending from BA47 through BA45 and BA9. These results suggest that the lMTG responds to verbs rather than high-motion words. This same contrast did not reach a corrected level of significance in blind individuals (possibly due to the smaller sample size of blind participants). However, the lMTG (−52, −48, 2) and right MTG (58, −42, −2) regions were present in this group at a threshold of P < 0.0005, k = 10, uncorrected. There was also activation in the left anterior temporal lobe at this threshold (−54, 6, −26; −60, −8, −16). No brain region showed a group-by-condition interaction.
We examined the difference between low-motion verbs and high-motion nouns in greater detail using ROI analyses. We identified the lMTG verb region by comparing the 2 highest motion verb categories to the 2 lowest motion noun categories. Using this contrast, we were able to identify a region in the posterior lMTG for 9/10 of our blind participants (X, Y, Z coordinates: −57 SD 7, −54 SD 9, 4 SD 5) and in 17/20 of our sighted participants (−56 SD 6, −56 SD 9, 3 SD 5). The size (k), location (X, Y, Z), and significance (t) of the lMTG region did not differ across groups (all P > 0.3; blind k = 100 SD 121, t = 5.02 SD 1.43; sighted k = 76 SD 74, t = 5.18 SD 1.92). In this lMTG ROI, we compared the low-motion verbs to high-motion nouns (thought verbs > animal nouns) and to backwards speech (these comparisons are orthogonal to the ROI definition). The lMTG ROI responded more to low-motion verbs than to high-motion nouns in both blind and sighted adults (n = 26 for all analyses of this ROI, main effect of condition F2,48 = 21.6, P < 0.0001). The size of this effect did not differ across groups (main effect of group F1,24 = 1.1, P = 0.30; group-by-condition interaction F2,48 = 1.93, P = 0.16). In post hoc comparisons (Tukey's honestly significant difference [HSD] test), low-motion verbs produced a higher response than backwards speech in both groups (P < 0.05), whereas BOLD signal for backwards speech and high-motion nouns did not differ from each other in either group (P > 0.05) (Fig. 1B).
The lMTG response was not predicted by the difficulty of the semantic judgments. Across participants, there was no relationship between the size of the thought verb > animal noun difference in the lMTG and the reaction time difference between these categories (r2 = 0, P > 0.3). The effect of condition (thought verb > animal noun) on the lMTG response remained highly significant, even when reaction time differences were included as a covariate (F1,13 = 13.5, P = 0.003). By contrast, reaction time was not significantly related to lMTG activity (F1,21 = 1.4, P = 0.25). (Only participants for whom reaction time data were recorded are included in these analyses.)
Is the Functional Profile of the lMTG Similar in Blind and Sighted Adults?
We compared the functional profile of the lMTG in sighted and blind adults. One prediction of an empiricist view could be that although blind individuals have an lMTG region that responds to action verbs, its response is altered. To address this question, we compared the response of the lMTG to high, intermediate, and low-motion categories of verbs across blind and sighted participants.
For the purposes of this analysis, we functionally localized voxels that were more active for all verbs than all nouns in the posterior aspect of the lMTG in each of our blind and sighted participants individually (all verbs > all nouns; ROI identified in 9/10 blind and 18/20 sighted participants). The lMTG region thus identified did not differ in size, significance, or location between sighted and blind adults (sighted average peak [−58 SD 6, −50 SD 12, 6 SD 8] and blind average peak [−59 SD 5, −52 SD 10, 4 SD 4]) (all Ps > 0.3). Moreover, the peak location of this region was not different from the peak identified by comparing the high-motion verbs to the low-motion nouns (P > 0.3).
In the lMTG identified by the verbs > nouns contrast, we compared BOLD signal using a 2 × 3 ANOVA with group (blind vs. sighted) and condition (low-motion verbs, medium motion verbs, high-motion verbs) as factors. (Note that the comparison of verbs to each other is orthogonal to the contrast of verbs > nouns.) There was a main effect of verb type (F2,40 = 5.52, P < 0.008). However, there was no effect of group (F1,20 = 0.54, P = 0.47) and no group-by-condition interaction (F2,40 = 0.28, P = 0.76). Post hoc comparisons revealed greater activity for the low and medium motion verbs than for high-motion verbs (P < 0.05, Tukey's HSD) (Fig. 2B). These results demonstrate that the lMTG has a similar response profile in blind and sighted adults, and that it responds more to low-motion than high-motion verbs. This preserved response was observed despite reorganization in early visual regions (for discussion of group-by-condition effects in pericalcarine cortex, see Supplementary Material).
Are Action Representations Outside of the lMTG Altered in Congenitally Blind Adults?
To specifically investigate the role of motion features in verb meaning (as described above), we compared high-motion and low-motion verbs using whole-brain analysis. In sighted adults, there were no significant activations at a corrected threshold of P< 0.05. When the threshold was lowered to a lenient level of uncorrected P< 0.001, k = 10, we observed activity in the left inferior parietal lobule (IPL) (−60, −34, 36; t = 5.18; k = 37), bilateral medial fusiform gyrus (−38, −38, −20; t = 5.17; k = 38 and 32, −34, −26; t = 4.43; k = 10), the left cingulate gyrus (10, −32, 44; t = 4.69; k = 19), and right precentral gyrus (48, −6, 36; t = 4.23; k = 15). Of these activations, the left IPL was the only brain region that other studies had previously found to respond to high-motion words. The left IPL is also involved in higher order motion perception (Kellenbach et al. 2003; Noppeney et al. 2005). We therefore examined IPL activity further using ROI analysis.
We created a group- and condition-independent ROI by drawing a 5-mm sphere around the peak of activation in the left IPL reported by Noppeney et al. (−58, −36, 36). In a multiple regression analysis of percent signal change (PSC), we observed a reliable main effect of motion (F1,146 = 10.63, P = 0.001), no effect of grammatical class (F1,146 = 2.27, P = 0.13), and a trend toward an interaction between motion and grammatical class (F1,146 = 2.26, P = 0.10; the effect of motion was somewhat larger for nouns). There were no effects of group (F1,28 = 1.11, P = 0.30) or group-by-motion interaction (F1,146 = 0.56, P = 0.46). In summary, our IPL analysis replicated prior reports of a small but reliable motion effect in the IPL during word comprehension, which occurred for both nouns and verbs, and we found that this effect was preserved in congenitally blind individuals.
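A coordinate-based spherical ROI of this kind can be sketched as follows. The sketch makes several assumptions that the text does not confirm: that "5-mm" refers to the radius rather than the diameter, that the normalized images had 2 mm isotropic voxels, and that the voxel grid is aligned with the peak coordinate. The function name is hypothetical and this is not the software used in the study.

```python
def sphere_roi(peak, radius_mm=5.0, voxel_mm=2.0):
    """Return voxel centers (in mm) lying within radius_mm of the peak.

    Assumes an isotropic grid aligned with the peak coordinate; both
    radius_mm and voxel_mm are illustrative defaults, not reported values.
    """
    reach = int(radius_mm // voxel_mm)  # largest voxel offset worth checking
    coords = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            for k in range(-reach, reach + 1):
                dx, dy, dz = i * voxel_mm, j * voxel_mm, k * voxel_mm
                if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
                    coords.append((peak[0] + dx, peak[1] + dy, peak[2] + dz))
    return coords


if __name__ == "__main__":
    # Left IPL peak coordinate given in the text.
    roi = sphere_roi((-58, -36, 36))
    print(len(roi), "voxels in the ROI")
```

Because the ROI is defined from a previously published coordinate, it does not depend on either group or on any condition in the present data.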
Consistent with these results, whole-brain analyses revealed no brain regions that were more active in the sighted than the blind in this contrast, even at a lenient threshold of P< 0.001, k = 10 (group-by-condition interaction).
Strong empiricist theories suggest that action concepts are composed, in part, of visual–motion features (Pulvermuller 1999; Barsalou et al. 2003; Gallese and Lakoff 2005; Boulenger et al. 2009). These and other sensory features are said to be spontaneously activated when we understand words (e.g., Hauk et al. 2006; Willems et al. 2010). Activation in left posterior temporal lobe during action verb comprehension has been taken as evidence for this view (e.g., Martin et al. 1995; Kable et al. 2002; McClelland and Rogers 2003; Tranel et al. 2003; Tettamanti et al. 2005; Patterson et al. 2007). We find that the lMTG is indeed spontaneously active during verb comprehension but that the lMTG does not represent visual–motion features. First, the lMTG is unchanged in individuals who have never seen. In this regard, our findings extend the prior work of Noppeney et al. (2003) who observed greater activity for hand actions than “visual” and “sound” words. We find that the neuroanatomical location, size, and response profile of the lMTG are identical in congenitally blind and sighted adults. Crucially, we find that in both blind and sighted people, the lMTG is recruited for 3 classes of verbs (high, medium, and low motion) and shows no response to 3 classes of nouns. Amount of motion information did not predict lMTG activity either for verbs or for nouns. Together these data demonstrate that some component of verb meanings is indeed spontaneously activated in the lMTG, but these representations are neither specifically visual features nor specifically motion features.
lMTG Activity Does Not Reflect Retrieval of Visual–Motion Features
There are 2 possible concerns with this conclusion from our data. First, perhaps visual–motion representations in the lMTG are normally retrieved during language comprehension, but such retrieval was inhibited by our task. Second, perhaps visual–motion features stored in the lMTG are retrieved by sighted people, but blind people retrieve lMTG motion information of another modality. We consider these 2 concerns in turn.
Could our task have inhibited activation of visual–motion features that are normally retrieved during action verb comprehension? A wealth of behavioral data has shown that the meanings of words are automatically retrieved when speakers listen to or read words in their native language, irrespective of task (e.g., Stroop 1935; Neely 1991). Therefore, if visual–motion features are an integral part of these meanings they should be activated automatically. If anything, though, our particular task should make semantic processing of the word meanings more likely. Our participants judged the semantic similarity of word pairs, within category (e.g., “to run—to kick,” “to hop—to jump”). Such within category judgments rely on retrieving detailed aspects of the verb meaning (Kemmerer and Gonzalez-Castillo 2010).
The present task produced activation in the same part of the lMTG as has been observed in a wide range of other semantic tasks: semantic-triad judgments, synonym judgments, and action generation to objects (e.g., Martin et al. 1996; Kable et al. 2002, 2005; Davis et al. 2004). The same functional pattern is observed even when judgments about action verb meanings are based on their manner of motion (Kable et al. 2005). Thus, our task appears to recruit a common neural substrate of action verb comprehension.
Could the lMTG store visual–motion representations in the sighted, but a different kind of motion representation in the blind? On this view, lMTG in the blind might represent either another sensory modality of motion (e.g., auditory or motor) or a modality-independent spatiotemporal representation of motion. The lMTG responds more to low-motion thought verbs than to high-motion action verbs in both sighted and blind groups. Also, the lMTG response to high-motion nouns (names of animals) is no higher than to backwards speech (see also Grossman et al. 2002; Bedny et al. 2008). These data are inconsistent with the hypothesis that the lMTG stores motion representations in any modality, in either sighted or blind people. However, our data do not rule out the possibility of a privileged relationship between the representations of the lMTG and the motion perception system either evolutionarily or developmentally (Mahon et al. 2009).
Does Action Verb Comprehension Evoke Visual–Motion Representations Outside of lMTG?
Could visual–motion features be recruited during word comprehension, in a different brain region rather than the lMTG? The most obvious candidate would be the visual–motion region, MT/MST. However, multiple studies have explicitly investigated MT/MST recruitment during word comprehension tasks and do not find increased activity in these regions (Kable et al. 2002; Kable et al. 2005; Bedny et al. 2008). Rather than in MT/MST, activity is observed in lMTG, even when participants make judgments about the similarity of action verbs based on manner of motion (Kable et al. 2005). Nor is activity observed in the right STS (rSTS), which is involved in biological motion perception (Grossman et al. 2000; Grossman et al. 2005; Bedny et al. 2008). We do find some evidence that parietal spatiotemporal representations of motion are active during word comprehension. Consistent with prior studies, a region within the parietal lobe (the IPL) responded to high-motion words (Kellenbach et al. 2003; Noppeney et al. 2005). However, the motion representations of the IPL are likely multimodal or spatiotemporal rather than visual. The parietal lobe contains several types of multimodal spatial representations (Andersen et al. 1997; Grefkes and Fink 2005). Unlike lower-level visual–motion regions like MT/MST, the IPL responds to motion in multiple modalities (i.e., both visual and auditory motion) (Lewis et al. 2000). In our own data, we observed a similar IPL motion response in the sighted and the congenitally blind. So absence of visual experience does not alter IPL motion representations (see also Mahon et al. 2010). Together, these data suggest that interactions between word comprehension and perception may occur at the level of spatial and multimodal representations rather than modality-specific representations. Consistent with this claim, during motion perception, activity in the IPL, but not in MT/MST, is modulated by linguistic context (Lewis et al. 2000; Sadaghiani et al. 2009). 
We hypothesize therefore that the IPL, and not MT/MST or the lMTG, may mediate previously reported behavioral interactions between action verb comprehension and motion perception (Meteyard et al. 2007, 2008).
Future studies will have to confirm the role of the IPL in mediating interactions between language and vision. The parietal response observed in the present study was weaker than the lMTG response. Furthermore, other studies of action verb processing have not observed a response in the same part of the IPL and find no parietal response at all for some classes of action verbs (e.g., Kemmerer et al. 2008). Finally, since we did not identify motion responsive IPL areas in individual participants, it is still possible that distinct parietal areas respond to perceptual motion and word comprehension.
Although we find that modality-specific visual representations are not retrieved during word comprehension, such representations may be engaged during other conceptual and linguistic tasks. For example, visual–motion representations can be retrieved based on verbal stimuli when the task involves visual imagery (Goebel et al. 1998; Grossman and Blake 2001). There is some evidence that MT/MST and the rSTS are activated when participants are presented with sentences or passages that describe motion (Saygin et al. 2009; Deen and McCarthy 2010). Additionally, one paper has suggested that a region anterior to MT/MST and perhaps partially overlapping with MT/MST is activated when participants match newly learned nonsense words with associated visual–motion events (Revill et al. 2008). Therefore, modality-specific visual representations may be optionally retrieved or generated during some verbal tasks.
Whether such sensory or spatiotemporal representations are considered a part of action concepts or action verb meanings is in part a theoretical question. While some theories hold that such information is external to word meanings (e.g., Jackendoff 1975–2010; Bierwisch and Schreuder 1992), others consider spatial or spatiotemporal representations to be distinct but integral parts of action verb meaning (e.g., Kemmerer and Gonzalez-Castillo 2010). Here we present evidence that representations retrieved automatically during action verb comprehension are not modality-specific “images” of visual motion.
lMTG Activity Reflects the Retrieval of Abstract Semantic or Grammatical Features
What is the nature of the spontaneously retrieved and modality-independent representations of the lMTG? Along with a number of prior studies, we find that the lMTG responded more to verbs than nouns (Perani et al. 1999; Grossman et al. 2002; Davis et al. 2004; Shapiro et al. 2006; Yokoyama et al. 2006; Palti et al. 2007; Bedny et al. 2008). This basic observation suggests several possible functions for the lMTG. First, the lMTG may represent the kinds of concepts verbs tend to refer to: events, states, and relations, as opposed to entities (Frawley 1992). These concepts may be specifically linguistic or also accessible during nonlinguistic conceptual tasks (Potter and Faulconer 1975; Potter et al. 1977; Jackendoff 1999). The conceptual representations may be schematic or may be abstract but highly detailed; for example, patients with lMTG damage are impaired in subtle semantic judgments about both verbs and pictures of actions (Tranel et al. 2003; Kemmerer et al. 2010). Second, it is possible that the semantic information stored in the lMTG is not particular to any domain of concepts like events. Rather, the lMTG may respond to verbs because these have a richer or more complex semantic structure than nouns, on average. If so, one should also observe increased lMTG activity for semantically complex words that do not refer to events. Third, the lMTG may represent grammatical information relevant to verbs.
What kind of grammatical information might the lMTG represent? The lMTG could represent information about how verbs are inflected (morphosyntax). This view seems unlikely because lMTG activity for verbs has been observed in semantic tasks even with minimal morphosyntactic demands. The verbs are often not inflected, nor are subjects required to inflect them (e.g., Martin et al. 1995). More plausibly, the lMTG may represent information about the way a verb behaves in sentences, that is, its argument structure (Shetreet et al. 2007; den Ouden et al. 2009; Snijders et al. 2009). In the present study, richness of argument structure did not predict lMTG activity (as measured by the number of subcategorization frames or the number of arguments per frame; Supplementary Figure 2 and Supplementary Results). There is, however, some evidence that the lMTG may respond to argument structure when verbs are processed in a sentence context (Shetreet et al. 2007; den Ouden et al. 2009; Snijders et al. 2009). Because argument structure is correlated with verb meaning, such evidence is consistent with the possibility that the lMTG represents either conceptual or grammatical information. For example, the verb “put” has 3 arguments; thus the sentence “Yesterday Mary put.” is not felicitous. Parallel to this syntactic behavior, the concept of putting involves 3 entities: an agent that puts something somewhere (Fisher et al. 1991; Levin 1993; Jackendoff 1999; Pinker 2007). An intriguing possibility is that the lMTG represents the kind of conceptual information that is relevant to syntax (Jackendoff 1999). Future work is clearly required to clarify what sort of conceptual or grammatical information the lMTG represents.
The Role of Sensory Experience in the Development of Conceptual Brain Regions
We find that the representations activated during word comprehension are not altered by congenital blindness. The preserved neural substrates of action verb comprehension stand in contrast to the striking plasticity in sensory brain regions following early changes in sensory experience. For example, the location and functional profile of visual brain regions are changed by congenital blindness from vision to audition and touch and even to higher cognitive domains (Amedi et al. 2003; Hensch 2005; Pascual-Leone et al. 2005; Merabet et al. 2007; Noppeney 2007). Our data suggest that these sensory changes do not carry forward into conceptual systems (for similar arguments, see Bedny et al. 2009).
It is nevertheless still possible that there are privileged relations, during development, between conceptual domains and specific sensory–motor systems. For example, the perception of motion in some modality may be required for normal development of the lMTG. These relations might be bidirectional, including the possibility that the perceptual systems are partly organized along conceptual domains (Caramazza and Mahon 2003; Mahon and Caramazza 2008).
What do these data from congenitally blind adults tell us about how conceptual brain regions are shaped by experience? One possible conclusion is that conceptual brain regions do not exhibit experience-dependent plasticity. We think it is unlikely, though, that conceptual brain regions are physiologically different from perceptual regions in their potential for plasticity. We therefore favor the interpretation that both conceptual and perceptual brain regions could exhibit experience-dependent plasticity, but the development of conceptual brain regions is robust to the absence of vision in particular. That is, the experience of blind children is not different from that of sighted children in ways that matter for the formation of brain regions involved in understanding action verbs (Landau and Gleitman 1985; Gillette et al. 1999; Noppeney et al. 2003; Ricciardi et al. 2009). In this regard, our results are consistent with behavioral work showing largely similar semantic representations in blind individuals even for words that refer to visual experiences such as color names and verbs of seeing (Marmor 1978; Landau and Gleitman 1985; Shepard and Cooper 1992). The present neural data further suggest that, at least in the case of action verbs, not only the content but also the format of semantic representations is similar in sighted and blind adults. lMTG verb representations might, however, be influenced by linguistic experience (Gillette et al. 1999). More generally, changes in higher order aspects of experience (e.g., linguistic, social, causal) may affect the development of conceptual brain regions.
Finally, the fact that blind and sighted people activate the same brain regions during comprehension does not mean there are no changes in the microstructure of conceptual representations in blind individuals (for an example of such effects, see Connolly et al. 2007). Such changes might reflect differences in the kind of abstract information that is most readily accessible through vision, versus audition or touch. We hypothesize that these conceptual differences, when they exist, are similar in magnitude and in kind to subtle differences in concepts among sighted individuals who have different experiences and expertise. For example, biology professors have a more elaborated notion of “living” than laymen (Goldberg and Thompson-Schill 2009). Such differences need not imply that experts or blind individuals represent concepts in a different format (i.e., visual vs. auditory) or that different brain regions support their representations.
In summary, we find that action verb comprehension engages the same brain regions in congenitally blind and sighted individuals. Our data suggest that concepts retrieved during action verb comprehension are abstracted away from sensory–motor experiences and represented in a modality-independent format.
David and Lucile Packard Foundation (R.S.); National Institutes of Health (R01 MH067008 and R01 DC006842 to A.C. and A.P.L. and K24RR01887, R01 EY12091, and R21 EY0116168 to A.P.L.).
We thank our participants and the New England blind community for making this research possible. We would also like to thank Jonathan Raye for recording the words, and Lucy Chen, Ami Patel, John Scholz, David Feder, Talia Konkle, and the Athinoula A. Martinos Imaging Center for help with fMRI data collection and analyses. We would also like to thank Jorie Koster-Moeller for help with the linguistic analyses and Mike Frank for comments on earlier versions of this draft. Conflict of Interest: None declared.