The inhibition of speech acts is a critical aspect of human executive control over thought and action, but its neural underpinnings are poorly understood. Using functional magnetic resonance imaging and the stop-signal paradigm, we examined the neural correlates of speech control in comparison to manual motor control. Initiation of a verbal response activated left inferior frontal cortex (IFC: Broca's area). Successful inhibition of speech (naming of letters or pseudowords) engaged a region of right IFC (including pars opercularis and anterior insular cortex) as well as presupplementary motor area (pre-SMA); these regions were also activated by successful inhibition of a hand response (i.e., a button press). Moreover, the speed with which subjects inhibited their responses, stop-signal reaction time, was significantly correlated between speech and manual inhibition tasks. These findings suggest a functional dissociation of left and right IFC in initiating versus inhibiting vocal responses, and that manual responses and speech acts share a common inhibitory mechanism localized in the right IFC and pre-SMA.
Effective verbal communication requires fine control over speech acts, including the ability to inhibit initiated speech at almost any point in the production process (Ladefoged et al. 1973). Yet, it remains unclear how this process is implemented neurally. For more than a century and a half, evidence from lesion (Hillis et al. 2004; Gough et al. 2005), genetic (Lai et al. 2001; Belton et al. 2003; Liegeois et al. 2003), and functional imaging studies (Fiez et al. 1999; Poldrack et al. 1999) has strongly emphasized the critical role of the left inferior frontal cortex (IFC; BA44/45; also known as “Broca's area”) in speech production. From this perspective, it might be expected that the left IFC would also be involved in inhibiting speech. From another perspective, however, it might be expected that the right IFC could inhibit speech that is generated in the left IFC. This prediction derives from 2 sets of observations. First, the right hemisphere “homologue” of Broca's area is involved in aspects of speech control, such as the timing of covert speech (Shergill et al. 2006), and right IFC overactivity has been implicated in stuttering (Fox et al. 1996). Second, the right IFC has also been implicated in the inhibition of manual motor responses (Aron et al. 2003; Buchsbaum et al. 2005; Aron and Poldrack 2006; Chambers et al. 2006; Garavan et al. 2006) and eye movements (Chikazoe et al. 2007). Importantly, damage to this region, but not to its left hemisphere homologue, impaired the ability of subjects to cancel initiated manual responses when given a stop signal (Aron et al. 2003).
Most researchers now assume that language and the motor system are tightly associated with one another (Holden 2004). This is clearly manifested by the association of handedness and language asymmetry (e.g., right handedness associated with language dominance in the left hemisphere) (Knecht et al. 2000) as well as the fact that the same left IFC region involved in the control of speech production is also associated with various nonlanguage motor functions such as planning, recognition, and imitation of actions (Rizzolatti and Arbib 1998; Nishitani and Hari 2000; Heiser et al. 2003). In the present study, we aimed to extend these observations by examining whether inhibition of speech acts involves the same right IFC region as manual response inhibition.
We deployed functional magnetic resonance imaging (fMRI) and the stop-signal paradigm (Logan and Cowan 1984) to explore the neural substrates of speech and manual response initiation and inhibition. Two spoken conditions were included which varied in the complexity of their linguistic processes. In the Letter Naming condition, subjects were to name the letter “T” or “D”; this condition aimed to closely match the stimulus in the manual condition (in which subjects responded with left and right button presses to those letters respectively) but is linguistically simple. In the pseudoword (PW) Naming condition, they were to name visually presented PWs (such as “haxp”). This condition is more linguistically complex than the Letter Naming condition and is thus more likely to engage left IFC regions (Fiez et al. 1999; Poldrack et al. 1999). The key analysis of interest was whether inhibiting an initiated speech act would activate right IFC (like manual stopping) or whether it would activate left IFC (Broca's area) instead. If we found common right hemisphere activation for stopping speech and manual responses, then that would suggest a common function in right IFC for behavioral control regardless of effector.
Materials and Methods
Fifteen healthy native English-speaking subjects participated in this study (6 males, 23.6 ± 6 years old, ranged from 18 to 39). All subjects had normal or corrected-to-normal vision and were right handed as judged by the Edinburgh Handedness Inventory (Oldfield 1971). They were free of neurological or psychiatric history and gave informed consent according to a procedure approved by the University of California, Los Angeles Human Subject Committee. Five additional subjects were removed from the analysis due to excessive head motion in the speech conditions.
The Stop-Signal Task
Three versions of the stop-signal paradigm were used, namely, the Manual, Letter Naming, and PW Naming tasks (Supplementary Fig. S1a). Each version consisted of a number of Go trials and Stop trials. On Go trials, the subject responded as fast as possible to the visual stimulus presented on the screen. For the Manual task, subjects responded to the letters T or D with their right index or middle finger, respectively; for the Letter Naming task, subjects were to name the letters T or D; for the PW Naming task, subjects were to name PWs. On Stop trials (25% of trials), the subject attempted to stop his/her response when a stop signal (i.e., a “beep”) was sounded at a particular stop-signal delay (SSD) subsequent to the visual stimulus.
According to the race model (Logan and Cowan 1984; Boucher et al. 2007), Go and Stop processes in these tasks run independently and performance is characterized in terms of a race between these 2 processes. That is, whichever finishes first determines whether the response is executed or inhibited. The independence assumption implies that the distribution of Go processes on Stop trials (whether a response is made or not) is the same as the observed distribution of Go responses (when there is no Stop signal). When the SSD is short, the probability of inhibition [P(inhibit)] is high; when it is long, P(inhibit) is low. As a result, one can manipulate SSD to achieve a certain probability of successful inhibition. As SSD was varied to yield P(inhibit) ≈ 50%, the stop-signal reaction time (SSRT) was estimable by subtracting average SSD (also called the central SSD [SSDc]) from median reaction time (RT) of correct Go responses; SSRT provides a measure of the speed of the stopping process. This procedure is ideal for producing an estimate of SSRT that is relatively robust against any violations of independence between Go and Stop processes (Band et al. 2003). In the present study, a tracking version of the stop signal paradigm was used to achieve this purpose. Because it is hard to detect the vocal response against the background scanner noise, subjects participated in a behavioral test one day before the fMRI scan to estimate the SSD at which P(inhibit) = 50% (i.e., SSDc) under each task condition. SSDs for the fMRI scan were determined on the basis of these estimates to ensure that inhibition was equally challenging across the 3 versions.
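The link between the 50% inhibition point and the SSRT estimate can be made explicit. The following is a sketch under the race-model assumptions stated above, additionally assuming a constant SSRT; the symbol $T_{\mathrm{go}}$ (finishing time of the Go process) is notation introduced here for illustration and does not appear in the original text.

```latex
\begin{align*}
P(\text{respond} \mid \mathrm{SSD})
  &= P\bigl(T_{\mathrm{go}} < \mathrm{SSD} + \mathrm{SSRT}\bigr) \\
P(\text{respond} \mid \mathrm{SSD}_c) = 0.5
  \;&\Rightarrow\; \mathrm{SSD}_c + \mathrm{SSRT} = \operatorname{median}(T_{\mathrm{go}}) \\
  &\Rightarrow\; \mathrm{SSRT} = \operatorname{median}(\mathrm{RT}_{\mathrm{go}}) - \mathrm{SSD}_c
\end{align*}
```

The last step uses the independence assumption: the Go finishing times on Stop trials follow the same distribution as the observed Go RTs, so the median of the observed correct Go RTs can stand in for $\operatorname{median}(T_{\mathrm{go}})$.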
Prescan Behavioral Test
For all 3 versions, the basic paradigm was the following. On Go trials, each trial started with a white fixation point (i.e., crosshair) appearing in the center of the black background screen. After 500 ms, a white letter (T or D) (or a pseudoword in the PW Naming condition) appeared. The letter T appeared on half the trials and the letter D appeared on the other half. The order of “T” and “D” was randomized. The PWs were never repeated. The letter or PW remained on the screen until the subject responded or 1 s had elapsed, whichever occurred first. The next trial started after a 1-s interval. A Stop trial was identical to a Go trial in all respects except that a tone (900 Hz, duration 500 ms) was played at some delay after the stimulus. If the subject inhibited their response, then the stimulus remained on screen for the duration of 1 s. If the subject responded, then the stimulus disappeared. The next trial started after a 1-s delay. SSD changed dynamically throughout the experiment, depending on the subject's behavior. If the subject inhibited successfully on a Stop trial, then inhibition was made less likely on a subsequent Stop trial by increasing the SSD by 50 ms; if the subject did not successfully inhibit, then inhibition was made more likely by decreasing the SSD by 50 ms. Four step-up and step-down algorithms (staircases) were employed in this way to ensure convergence to P(inhibit) of 50% by the end of the experiment (Supplementary Fig. S1b). The 4 staircases started with SSD values of 100, 150, 200, and 250 ms, respectively. For each condition, there were 240 Go trials and 80 Stop trials. Each staircase therefore moved 20 times. The staircases were independent but randomly interleaved; that is, each particular Stop trial belonged to one particular staircase, but the order of staircases was random trial-by-trial.
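The tracking procedure can be illustrated with a short simulation. This is a minimal sketch, not the authors' code: the step size, starting SSDs, and number of Stop trials come from the text, whereas the simulated subject (Gaussian Go finishing times with a fixed SSRT), the 0-ms floor on SSD, and all function names are assumptions made for illustration.

```python
import random

STEP_MS = 50                          # step size stated in the text
START_SSDS_MS = (100, 150, 200, 250)  # starting SSDs of the 4 staircases
STOP_TRIALS = 80                      # Stop trials per condition


def update_ssd(ssd_ms, inhibited, step_ms=STEP_MS):
    """Next SSD for one staircase: longer after a successful stop, shorter after a failure."""
    next_ssd = ssd_ms + step_ms if inhibited else ssd_ms - step_ms
    return max(next_ssd, 0)  # assumption: SSD is floored at 0 ms


def run_tracking(stop_outcome, rng):
    """Run the 4 randomly interleaved staircases.

    stop_outcome(ssd_ms) must return True when the response was inhibited.
    Returns, per staircase, the list of (ssd_ms, inhibited) pairs in trial order.
    """
    n = len(START_SSDS_MS)
    current = list(START_SSDS_MS)
    history = [[] for _ in range(n)]
    # Each Stop trial belongs to exactly one staircase; the order is random.
    order = [i for i in range(n) for _ in range(STOP_TRIALS // n)]
    rng.shuffle(order)
    for idx in order:
        inhibited = stop_outcome(current[idx])
        history[idx].append((current[idx], inhibited))
        current[idx] = update_ssd(current[idx], inhibited)
    return history


def make_simulated_subject(rng, go_mean_ms=450.0, go_sd_ms=80.0, ssrt_ms=200.0):
    """Hypothetical race-model subject: stopping succeeds if Go finishes after SSD + SSRT."""
    def stop_outcome(ssd_ms):
        go_finish_ms = rng.gauss(go_mean_ms, go_sd_ms)
        return go_finish_ms > ssd_ms + ssrt_ms
    return stop_outcome


rng = random.Random(0)
history = run_tracking(make_simulated_subject(rng), rng)
```

With 80 Stop trials split evenly, each staircase receives 20 trials, matching the 20 moves per staircase described above.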
Before the test, it was made clear to subjects that stopping and going were equally important and that it would not always be possible to stop. Subjects responded with their right hands on a computer keyboard for the Manual tasks. Vocal responses in the Letter and PW Naming tasks were collected through a microphone, which was connected to an in-house built voice key (http://white.stanford.edu/hardware/voicekey) that can be used to measure voice onset time. Stop tones were played through headphones at a level comfortable to the subject. Each scan was preceded by an instruction screen with a reminder to the subject: “Remember, respond as FAST as you can. However, if you hear a beep, your task is to STOP yourself from responding. Stopping and Going are equally important.” After every 64 trials, subjects took a short break and were given feedback in the form of median correct RT and number of discrimination errors on Go trials (no error feedback was given in the Letter Naming and PW Naming task). The order of conditions was counterbalanced across subjects.
Central SSD was computed, for each subject, from the values of the 4 staircases after the subject had converged on 50% P(inhibit). Values for the last 10 moves of each staircase were averaged to give a stable SSD estimate. In case a staircase did not converge (which was rare), it was removed from analysis and the SSDs from the remaining staircases were averaged to estimate the SSDc. As mentioned above, SSRT was estimable by subtracting SSDc from median correct Go RT. One SSD value (SSDc) was calculated for each task version and for each subject and was then used for the fMRI test.
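The SSDc and SSRT computation described above can be sketched as follows. The averaging over the last 10 moves and the exclusion of non-converged staircases follow the text; the convergence criterion itself is not specified, so the `converged` flags are a hypothetical input, and the function names and example values are illustrative only.

```python
from statistics import mean, median


def central_ssd(staircase_ssds_ms, converged=None, last_n=10):
    """Average the last `last_n` SSDs of each converged staircase, then average across staircases."""
    if converged is None:
        converged = [True] * len(staircase_ssds_ms)
    per_staircase = [
        mean(ssds[-last_n:])
        for ssds, ok in zip(staircase_ssds_ms, converged)
        if ok
    ]
    if not per_staircase:
        raise ValueError("no converged staircase available")
    return mean(per_staircase)


def ssrt_from_ssdc(correct_go_rts_ms, ssdc_ms):
    """SSRT estimate: median correct Go RT minus the central SSD."""
    return median(correct_go_rts_ms) - ssdc_ms


# Made-up example: 4 staircases with 20 SSD values each (ms).
example_staircases = [
    [200] * 20,
    [250] * 20,
    [100] * 10 + [300] * 10,
    [250] * 20,
]
example_ssdc = central_ssd(example_staircases)
example_ssrt = ssrt_from_ssdc([400, 450, 500], example_ssdc)
```

In this made-up example the per-staircase averages are 200, 250, 300, and 250 ms, so SSDc is 250 ms and, with a median Go RT of 450 ms, the SSRT estimate is 200 ms.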
Behavior in the fMRI Session
Overall, the basic paradigm used in the fMRI test was very similar to that in the behavioral test except for 3 major differences. First, we used custom Matlab code to select sequences of Go, Stop, and null events and to select the distribution of null time in a way that optimized the detection of hemodynamic responses for the critical contrast of Stop and Go events. Null events were imposed between every Stop or Go trial. The duration of null time ranged between 0.5 and 4 s (mean 1 s, sampled from an exponential distribution truncated at 4 s). A large number of sequences were generated within these constraints and the sequences with the highest efficiency to detect differences between Go and Stop events were selected (Liu et al. 2001). There were 32 Stop and 96 Go trials per scan (128 trials total). There were 2 scans for each condition (Manual, Letter Naming, and PW Naming). Second, the SSD in the fMRI test was generated according to the SSDc estimated in the behavioral test in the following way. In each scan, the 32 Stop trials comprised 8 trials at each of 4 SSD levels (SSDc − 60 ms, SSDc − 20 ms, SSDc + 20 ms, and SSDc + 60 ms). Third, the stimuli remained on the screen for 1 s irrespective of the subjects’ response because scanner noise precluded online detection of subject responses.
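The fixed-SSD schedule and the jittered null time can be sketched as follows. This is an illustration under stated assumptions rather than the custom Matlab code used in the study: the SSD offsets, trial counts, and 0.5 to 4 s range come from the text, but the random ordering of SSD levels, the shifted-exponential parameterization (`mean_excess_s`), and the rejection sampling are assumptions, and the efficiency-based sequence selection (Liu et al. 2001) is not implemented.

```python
import random

SSD_OFFSETS_MS = (-60, -20, 20, 60)  # offsets from SSDc stated in the text
STOPS_PER_SCAN = 32                  # Stop trials per scan


def stop_ssd_schedule(ssdc_ms, rng):
    """Return 32 SSDs for one scan: 8 trials at each of the 4 levels, in random order (assumed)."""
    per_level = STOPS_PER_SCAN // len(SSD_OFFSETS_MS)
    ssds = [ssdc_ms + off for off in SSD_OFFSETS_MS for _ in range(per_level)]
    rng.shuffle(ssds)
    return ssds


def sample_null_s(rng, minimum_s=0.5, maximum_s=4.0, mean_excess_s=0.5):
    """Draw one null duration from a shifted exponential, rejecting draws above the maximum.

    mean_excess_s is a hypothetical parameter chosen so the untruncated mean is about 1 s.
    """
    while True:
        duration = minimum_s + rng.expovariate(1.0 / mean_excess_s)
        if duration <= maximum_s:
            return duration


rng = random.Random(1)
schedule = stop_ssd_schedule(250, rng)            # hypothetical SSDc of 250 ms
nulls = [sample_null_s(rng) for _ in range(200)]  # example null durations in seconds
```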
Subjects made their manual responses via an MRI-compatible button box, and the responses were recorded by the computer. An observed SSRT (SSRTobs) was estimated for each SSD using the race model. For example, if the response rate at a given SSD is 30%, then the corresponding SSRTobs for this SSD is the 30th percentile of the Go RT distribution on nonstop trials minus the SSD. These SSRTobs values were then averaged to obtain a single estimate of SSRT. During the Letter Naming and PW Naming tasks, vocal responses and background scanner noise were recorded via an MRI-compatible microphone. The recordings were subsequently denoised using Cool Edit (Syntrillium Software, 2001) to allow accurate detection of the presence of responses on Go and Stop trials (Supplementary Fig. S1c). However, this method cannot provide an accurate estimate of vocal onset. We thus did not calculate the Go RT and SSRT for the Letter Naming and PW Naming tasks in the scanner. However, results from the Manual task indicated that behavioral performance in the behavioral test and the fMRI session was comparable (see Results).
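The per-SSD estimate for the Manual task can be sketched as follows. The rule (percentile of the Go RT distribution at the observed response rate, minus the SSD, then averaged across SSDs) follows the text; the linear-interpolation percentile, the function names, and the example numbers are assumptions for illustration.

```python
from statistics import mean


def percentile(sorted_values, fraction):
    """Percentile by linear interpolation between order statistics (assumed rule); fraction in [0, 1]."""
    if not sorted_values:
        raise ValueError("no values")
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * weight


def ssrt_observed(go_rts_ms, respond_rate_by_ssd):
    """Return (mean SSRTobs, per-SSD SSRTobs) given Go RTs and P(respond) at each SSD."""
    go_sorted = sorted(go_rts_ms)
    per_ssd = {
        ssd_ms: percentile(go_sorted, rate) - ssd_ms
        for ssd_ms, rate in respond_rate_by_ssd.items()
    }
    return mean(list(per_ssd.values())), per_ssd


# Made-up example: Go RTs from 400 to 500 ms and response rates at 2 SSDs.
example_go_rts = list(range(400, 501, 10))
example_mean, example_per_ssd = ssrt_observed(example_go_rts, {190: 0.25, 250: 0.5})
```

In this made-up example the 25th and 50th percentiles of the Go RTs are 425 and 450 ms, giving SSRTobs values of 235 and 200 ms and an average of 217.5 ms.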
MRI Data Acquisition
Image data were collected using a 3T Siemens Allegra MRI scanner. For each run, 182 functional T2*-weighted echo-planar images (EPI) were acquired with the following parameters: slice thickness = 4 mm, 33 slices, repetition time [TR] = 2 s, echo time [TE] = 30 ms, flip angle = 90°, matrix 64 × 64, field of view [FOV] = 200. Additionally, a T2-weighted matched-bandwidth high-resolution anatomical scan (same slice prescription as EPI) and MPRAGE were acquired. The parameters for MPRAGE were: TR = 2.3, TE = 2.1, FOV = 256, matrix = 192 × 192, sagittal plane, slice thickness = 1 mm, 160 slices. Stimulus presentation and the timing of all stimuli and response events were controlled using Matlab (Mathworks) and the Psychtoolbox (www.psychtoolbox.org) on an IBM laptop.
Imaging Preprocessing and Statistical Analysis
Initial analysis was carried out using tools from the FMRIB software library (www.fmrib.ox.ac.uk/fsl) version 3.3. The first 2 volumes in each time series were discarded to allow for T1 equilibrium effects. The remaining images were then realigned to compensate for small head movements (Jenkinson and Smith 2001). Translational movement parameters never exceeded 1 voxel in any direction for any subject or session. All images were denoised using MELODIC independent components analysis within FSL (Tohka et al. forthcoming); an average of 16.6 components was removed from each scanning run (range: 3–35). Data were spatially smoothed using a 5-mm full-width-half-maximum Gaussian kernel. The data were filtered in the temporal domain using a nonlinear high-pass filter with a 66-s cutoff. A 3-step registration procedure was used whereby EPI images were first registered to the matched-bandwidth high-resolution scan, then to the MPRAGE structural image, and finally into standard (Montreal Neurological Institute [MNI]) space, using affine transformations with FLIRT (Jenkinson and Smith 2001) to the avg152 T1 MNI template.
There were 2 variations of model fitting: 1) Standard analysis for all tasks: The following events were modeled after convolution with a canonical hemodynamic response function: Go, StopInhibit, StopFail, and nuisance events consisting of Go trials on which subjects did not respond or made errors. Events were modeled at the time of the letter or word stimulus. Temporal derivatives and the 6 motion parameters were included as covariates of no interest to improve statistical sensitivity. Null events were not explicitly modeled, and therefore constituted an implicit baseline. For each subject, and each scan, the following 4 contrast images were computed: Go–null; StopInhibit–null; StopFail–null; StopInhibit–Go. 2) For the PW Naming task, we performed a further analysis that split the StopInhibit trials into Early_Inhibit (i.e., SSD was shorter than SSDc) and Late_Inhibit (i.e., SSD was longer than SSDc) trials. This analysis allowed examination of the difference in activation when inhibition occurred early (i.e., soon after the Go process was initiated) compared with when it occurred late.
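The assignment of trials to event types can be sketched as follows. The event names follow the text; the function names and arguments are hypothetical, and the sketch covers only the labeling step, not the convolution or model fitting performed in FSL.

```python
def classify_trial(is_stop, responded, correct=True):
    """Label one trial with the event type used in the standard analysis."""
    if is_stop:
        # A Stop trial with a response is a failed stop; without one, a successful stop.
        return "StopFail" if responded else "StopInhibit"
    # Go trials with no response or an error are modeled as nuisance events.
    return "Go" if (responded and correct) else "Nuisance"


def split_inhibit(ssd_ms, ssdc_ms):
    """Label a StopInhibit trial by its SSD relative to SSDc (PW Naming analysis)."""
    # With the fixed offsets of -60, -20, +20, and +60 ms, SSD never equals SSDc.
    return "Early_Inhibit" if ssd_ms < ssdc_ms else "Late_Inhibit"


labels = [
    classify_trial(is_stop=False, responded=True),
    classify_trial(is_stop=False, responded=False),
    classify_trial(is_stop=True, responded=True),
    classify_trial(is_stop=True, responded=False),
]
```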
A second-level analysis was performed to average across scanning runs on each task for each subject, using FLAME (FMRIB's Local Analysis of Mixed Effects) stage 1 only with between-runs variance pooled across subjects (Beckmann et al. 2003; Woolrich et al. 2004). These data were then analyzed using a mixed-effects model (treating subjects as a random effect) with FLAME stage 1 only. Unless otherwise indicated, group images were thresholded with a height threshold of z > 2.3 and a cluster probability, P < 0.05, corrected for whole-brain multiple comparisons using Gaussian random field theory. For analyses with specific anatomical hypotheses (i.e., activation of right IFC in StopInhibit–Go [PW Naming] and left IFC activation in Go [PW Naming > Letter Naming]), maps were corrected using the adaptation of Gaussian random field theory for small volumes, which were anatomically defined according to an anatomical atlas (Tzourio-Mazoyer et al. 2002). The right IFC included right pars opercularis and anterior insular cortex, whereas the left IFC included the left pars opercularis and adjacent ventral precentral gyrus.
Conjunction analysis was performed for the contrast of StopInhibit versus Go across the 3 task versions, using the procedure suggested by Nichols et al. (2005). Accordingly, group maps for the StopInhibit–Go contrast in each of the Letter Naming, PW Naming, and Manual conditions were thresholded individually at z = 2.3 (uncorrected at the voxel level), binarized, and multiplied, thus revealing brain regions that were significantly activated by response suppression for all 3 tasks. To confirm that the common right IFC activation was not a false positive, we did a further conjunction analysis based on the small-volume-corrected map for each task, using the same approach. This analysis revealed the same cluster in the right IFC and anterior insula region. A conjunction analysis was also performed for PW Naming (Go) > Letter Naming (Go) and PW Naming (Go) > Manual (Go). This analysis revealed brain regions that were specific to initiation of the PW naming response.
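The threshold, binarize, and multiply steps can be sketched on toy data. This is an illustration only: each map is represented as a flat list of voxelwise z values (an assumption; the actual analysis operated on group statistical images), and the function names and example values are hypothetical.

```python
Z_THRESHOLD = 2.3  # voxelwise threshold stated in the text


def binarize(z_map, threshold=Z_THRESHOLD):
    """Return 1 for voxels above threshold and 0 elsewhere."""
    return [1 if z > threshold else 0 for z in z_map]


def conjunction(*z_maps, threshold=Z_THRESHOLD):
    """Multiply the binarized maps: a voxel survives only if it exceeds threshold in every map."""
    if not z_maps:
        raise ValueError("at least one map is required")
    n_voxels = len(z_maps[0])
    if any(len(z_map) != n_voxels for z_map in z_maps):
        raise ValueError("all maps must have the same number of voxels")
    result = [1] * n_voxels
    for z_map in z_maps:
        mask = binarize(z_map, threshold)
        result = [r * m for r, m in zip(result, mask)]
    return result


# Made-up z values for 4 voxels in each task's StopInhibit-Go map.
manual_z = [3.1, 2.5, 1.0, 4.0]
letter_z = [2.4, 2.2, 3.0, 3.3]
pw_z = [2.6, 3.5, 2.9, 2.31]
common = conjunction(manual_z, letter_z, pw_z)
```

Only the first and last toy voxels exceed the threshold in all 3 maps, so only they survive the conjunction.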
Regions of Interest Analysis
Regions of Interest (ROIs) were defined to quantify the degree of activation. Two ROIs were defined: 1) the right IFC, which represented the common response inhibition center, was defined as the region surviving the conjunction analysis within the anatomical boundary of right pars opercularis and adjacent insular cortex and 2) the left IFC, which represented the region for speech initiation, was defined as the region that survived the conjunction analysis of the contrasts PW Naming (Go) > Letter Naming (Go) and PW Naming (Go) > Manual (Go) within the anatomical boundary of the left pars opercularis and adjacent ventral precentral gyrus. The left homologue of the right IFC was also defined to examine the laterality of response inhibition. For ROI analyses, the mean effect size (i.e., COPE) was extracted for each subject and for each contrast and was then used for further statistical analysis.
Quantification of Head Motion in the Scanner
Prescan and Scanning Behavioral Results
Before scanning, a tracking version (Supplementary Fig. S1b) of the stop-signal paradigm was used to estimate, for each subject and condition, the average SSD (SSDc) that yielded 50% StopInhibit trials (i.e., Stop trials without a response). This allowed an estimate of the speed with which each subject stopped their response in each condition (SSRT) (Logan and Cowan 1984) as well as the establishment of key fixed SSD parameters for the scanning experiment. Naming PWs (i.e., Going) was significantly slower than both Manual responses (t14 = 9.52, P < 0.001) and Letter Naming (t14 = 7.26, P < 0.001). SSRT (i.e., Stopping) was faster for Letter Naming responses than for PW Naming responses (t14 = −2.56, P = 0.02) and marginally faster than for Manual responses (t14 = −2.01, P = 0.06) (Fig. 1a). Importantly, across subjects, SSRT for the 3 tasks was significantly correlated (Manual and Letter Naming, r = 0.57, P = 0.027; Manual and PW Naming, r = 0.55, P = 0.033; and Letter Naming and PW Naming, r = 0.68, P = 0.005) (Fig. 1b), suggesting that response inhibition in these tasks involves a common cognitive process.
During the scan, a set of 4 fixed SSD values was used for each subject, based on the prescan behavioral test (i.e., SSDc − 60 ms, SSDc − 20 ms, SSDc + 20 ms, SSDc + 60 ms). Vocal responses and background scanner noise were recorded. Subsequent removal of scanner noise allowed accurate detection of whether or not a vocal response was made and thus the discrimination of Go, StopFail, and StopInhibit trials (Supplementary Fig. S1c). Though this procedure did not allow us to accurately estimate the Go reaction time (and therefore SSRT) for the Letter Naming and PW Naming tasks, a comparison of prescan and scan behavioral performance for the Manual task found that Go RT and SSRT were highly correlated across the pretest and scanning session (Go RT: r = 0.77, P = 0.001; SSRT: r = 0.68, P = 0.005; also see Supplementary Fig. S2). As expected, stop likelihood decreased as SSD increased (F3,42 = 66.10, P < 0.001) (Supplementary Fig. S3).
Go Response Activated the Frontostriatal Direct Loop
For all 3 tasks, the execution of Go responses significantly activated frontal/basal ganglia circuitry consistent with a “direct” cortical–striatal–pallidal–thalamic–cortical pathway for response initiation (Mink 1996) (Fig. 2). For the Letter and PW Naming tasks, there was activation in the supplementary motor area (SMA), bilateral precentral gyrus, putamen, pallidum, and thalamus. For the Manual task, there was activation in the SMA as well as putamen and left primary motor cortex (M1). Although activation in pallidum and thalamus has been observed in prior studies of manual stopping (Aron and Poldrack 2006), it did not reach significance at the corrected threshold in the present study. The difference between Manual and vocal tasks in motor/premotor cortex was confirmed by direct comparison using the contrast of Manual versus Letter Naming + PW Naming (Manual > Letter Naming + PW Naming, M1: −44, −28, 56, Z = 4.58; Letter Naming + PW Naming > Manual, left precentral gyrus: −46, −14, 38, Z = 6.24; right precentral gyrus: 58, −10, 28, Z = 6.07). Bilateral superior temporal gyri (STG) were also activated in the Letter Naming and PW Naming tasks, presumably due to subjects hearing their own speech (Letter Naming + PW Naming > Manual, left STG: −52, −18, 8, Z = 5.76; right STG: 66, −20, 4, Z = 6.23). As expected, PW Naming elicited additional activation in the left IFC. A direct comparison between the PW Naming and Letter Naming conditions confirmed this result (MNI: −50, 10, 12, Z = 2.84; small volume corrected over the search volume of the left pars opercularis region), consistent with the role of this region in initiating and planning complex speech responses (Hillis et al. 2004; Gough et al. 2005).
Response Inhibition Activated a Common Right IFC Region
StopInhibit trials include an already initiated Go process with a subsequent Stop process. To isolate the neural correlates specific to successful stopping, we directly contrasted StopInhibit and Go trials. This analysis revealed strong right IFC activation, especially in the pars opercularis/insular cortex region, for the inhibition of Manual (MNI: 44, 18, 4, Z = 4.07), Letter Naming (MNI: 46, 18, 8, Z = 3.32), and PW Naming responses (MNI: 42, 16, 6, Z = 3.23) (Fig. 3a,b,c; also see Supplementary Table 1 for a complete list of activation foci in this contrast). There was also strong activation of bilateral STG for the Manual task, presumably reflecting the presence of the auditory stop signal on StopInhibit trials. This activation may have been subtracted out in the Letter Naming and PW Naming conditions due to the presence of self-generated speech on Go trials.
To identify whether the right IFC was activated across all 3 conditions, we performed a conjunction analysis (Nichols et al. 2005), which identified areas of overlap between thresholded group statistical maps for StopInhibit–Go for all 3 conditions (see Materials and Methods). This analysis revealed a common activation for all 3 tasks in the right IFC region (Fig. 3d,e) among other regions (Fig. 3d, Supplementary Table 2). There was no activation in the left frontal lobe, however, and this functional asymmetry was confirmed by comparing the right IFC activation with its left homologue (F1,14 = 79.72, P < 0.001; no task by hemisphere interaction: F2,28 = 1, P = 0.39) (Fig. 3f).
Although the main focus of the present study is the right IFC region, there is cumulative evidence suggesting the involvement of presupplementary motor area (pre-SMA) in response control (Mostofsky et al. 2003; Aron and Poldrack 2006; Floden and Stuss 2006; Stuphorn and Schall 2006; Aron et al. 2007; Isoda and Hikosaka 2007b). Consistent with these studies, we also found activation in pre-SMA for all 3 tasks (Fig. 3a,b,c; Supplementary Table 1), and conjunction analysis also revealed a common activation in this region (Figs 3d and 4a; Supplementary Table 2), though this result did not survive a corrected threshold.
Based on previous high-resolution results implicating the subthalamic nucleus (STN) region in inhibition of manual motor responses (Aron and Poldrack 2006), we also assessed whether the STN region was activated during response suppression across the 3 tasks. Although we found significant activation for the Manual task in a region encompassing portions of STN and thalamus (with the center of the volume of activation within thalamus, MNI: 8, −10, 4; Z = 3.03), it is difficult to specifically localize this activity due to the spatial resolution of the fMRI data. However, no activation was found for the Letter Naming or PW Naming tasks in that area (Fig. 4b), and an ANOVA on the signal from this region showed a significant difference among tasks (F2,13 = 3.45, P = 0.039). A post hoc test indicated that the activation was stronger for Manual than for Letter Naming + PW Naming (t14 = 3.04, P = 0.009).
Functional Interaction of Left and Right IFC in Speech Control
The foregoing analyses revealed 2 frontal regions, namely, the left and right IFC, that were differentially involved in initiating and inhibiting the speech response. This dissociation is more clearly shown by examining their interaction in different aspects of the PW Naming task alone. To examine how activation of the left and right IFC is modulated by whether or not inhibition is successful and also by the relative timing of the SSD, we separated the StopInhibit trials into Early_Inhibit (i.e., SSD was shorter than SSDc) and Late_Inhibit (i.e., SSD was longer than SSDc) trials and then compared activation for Go, StopFail, Early_Inhibit, and Late_Inhibit trials in both hemispheres. In the right hemisphere, which represented a common response inhibition center, activation was averaged in each condition for each subject within the right IFC region at which there was a conjunction for inhibition across tasks (Fig. 5a). In the left hemisphere, which was involved in speech initiation, activation was averaged in each condition for each subject within the left IFC region where activation for PW Naming was greater than for the Manual and Letter Naming tasks. When including the Go condition and all stop conditions, there was a significant trial by hemisphere interaction (F3,42 = 24.93, P < 0.001, Fig. 5b). This reflects the fact that the right IFC was only engaged on trials when inhibition (including unsuccessful inhibition) was engaged, whereas the left IFC was engaged under all task conditions, though to a lesser extent in the StopInhibit condition. This is consistent with the idea that the left IFC implements the Go process for speech production and that the Go process is either blocked or diminished on Stop trials; alternatively, it could reflect weaker initiation of the production process on trials in which the subject is subsequently able to stop.
An examination of the 3 stop conditions (i.e., Early_Inhibit, Late_Inhibit, and StopFail) indicated that left and right IFC showed similar variation across stop conditions (Task effect: F = 2.73, P = 0.08), with no stop condition by hemisphere interaction (F < 1). This may suggest that when the Go process has not been fully initiated and implemented, fewer neural resources are required to inhibit it.
We found that whereas the left IFC was strongly activated for naming PWs (more so than naming letters or making manual responses), successful inhibition of speech activated the same right IFC opercular/insula region as successful inhibition of a hand response (i.e., a button press). These results provide new insights into the functional relevance of right IFC in language processing. Whereas our results are consistent with a long line of work implicating the left IFC in the programming and execution of speech acts, they also demonstrate that the right IFC plays a corresponding role in the inhibition of speech acts. These results provide novel constraints on models of hemispheric specialization in motor control (Serrien et al. 2006) and serve as a bridge between manual and speech motor control systems. The inhibitory view of right IFC and its involvement in the inhibition of speech acts provides a basis for better understanding the functional relevance of right IFC involvement in speech production (Shergill et al. 2006). In addition, the present results help to interpret the finding that overactivity of the right IFC is associated with speech production disorders such as stuttering (Fox et al. 1996): such overactivation could relate to an overactive stopping process that may inappropriately brake speech output.
The present results also provide strong evidence for a common inhibitory mechanism between manual and speech acts. First, the right IFC was commonly engaged for StopInhibit–Go conditions in each of the speech and Manual tasks. Second, behavioral measures of response inhibition (SSRT) were correlated between the speech and manual tasks. The right hemisphere dominance in inhibition of bilateral larynx movements in speech extends previous results beyond the domain of manual (Aron and Poldrack 2005; Buchsbaum et al. 2005; Garavan et al. 2006) and oculomotor responses (Heinen et al. 2006; Chikazoe et al. 2007; Hodgson et al. 2007). It should be noted that the right lateralization of IFC in manual response inhibition is not an artifact of the use of the right hand in manual response tasks (Konishi et al. 1998, 1999). In these studies, subjects inhibited with either the right or the left hand, but activation was strongly right lateralized. Also, transcranial magnetic stimulation (TMS) of the right IFC prolonged SSRT regardless of which hand subjects were using (Chambers et al. 2006). Our study further suggests that right IFC activation is not limited to hand responses.
The finding of significant right IFC activation related to stopping of both manual and vocal responses is very unlikely to be explained by attentional- or stimulus-processing effects. Lesion studies have shown that the right IFC is critical for stop-signal response inhibition (Aron et al. 2003), task switching (Aron et al. 2004), attentional interference control (Michael et al. 2006), oculomotor rule switching (Hodgson et al. 2007), overcoming response perseveration (Clark et al. 2007), and the correction of automatic response errors (Walker et al. 1998; Hodgson et al. 2007)—all ostensibly related to a failure of inhibitory control which is unconfounded by the “oddball” effect; yet, all these paradigms call for control over irrelevant response tendencies (or currently incorrect stimulus-response mappings) and could well tap a common inhibitory mechanism.
There are several other pieces of important information that suggest that the right IFC response is not merely related to stimulus-driven attentional (or perceptual) factors. In the related Go/No-Go paradigm, robust right IFC activation was found for response inhibition even when controlling for oddball frequency (Chikazoe et al. 2007; also see Heinen et al. 2006). Further, neurophysiological recording studies in the lateral PFC have demonstrated responses that are specific to the meaning of the No-Go cue rather than to just the stimulus properties (Sakagami et al. 2001). Microstimulation studies of the IFC in human subjects have also led to its description as a “negative” motor area, that is, one in which stimulation leads to cancellation or suppression of voluntary motor acts (Luders et al. 1988). Finally, a recent EEG study found a strong right frontal N200 event-related potential component for successful stopping (consistent with several other such reports) but not for a stop-irrelevant condition, where a stop signal occurred but was to be ignored (Schmajuk et al. 2006).
Although further studies are required to directly rule out confounding effects (such as the oddball effect of infrequent Stop trials) in the speech condition, as has been previously done for manual and oculomotor conditions (e.g., Heinen et al. 2006; Chikazoe et al. 2007), we think, for several reasons, that the commonality in activation between speech and manual conditions suggests that the speech results would be similar to those for manual stopping. First, we used the same task paradigm for all 3 conditions and tested them on the same group of subjects. In particular, the Manual and Letter Naming conditions were exactly matched in all aspects except the response modality. The common IFC activation for the contrast of StopInhibit–Go in all 3 conditions thus should reflect common mechanisms. Second, along with the overlapping activation, the SSRTs for all 3 tasks were highly correlated, suggesting a common inhibitory mechanism attributed to the right IFC (as in the manual condition). Third, the results shown in Figure 5 suggest that right IFC activation was related to the inhibition effort but was not affected by whether the inhibition succeeded. Finally, though speech inhibition has not been widely studied, there is evidence from microstimulation in epilepsy patients that the right IFC is involved in the inhibition of speech responses (in which context it has been described as a “negative motor area” for speech, i.e., one in which stimulation produces speech arrest [Luders et al. 1988]).
In the current study, the common right IFC activation across tasks was located in the frontal operculum region and it extended to the dorsal anterior insula region, that is, the anterior dysgranular insula. Our earlier study with frontal lobe patients (Aron et al. 2003) found that the strongest relationship between SSRT and damage was for the pars opercularis region (roughly BA 44), although this was originally mistakenly reported as ‘pars triangularis’. A study using TMS also found that stimulation over the pars opercularis region disrupted SSRT (Chambers et al. 2006). Nevertheless, it is still unclear which exact part of the IFC (broadly construed) is critical for inhibiting initiated responses (manual, vocal, or oculomotor). It is indeed possible (even likely) that lesions (in the prior study by Aron et al. 2003) affected the anterior insular region, and it is possible that TMS disrupted this region too.
In addition to the lesion and TMS studies implicating pars opercularis (and possibly adjacent regions) in stop-signal response inhibition, many studies have implicated ventral sectors of the monkey lateral prefrontal cortex in Go/No-Go response inhibition. Excisions of the prefrontal convexity disrupt response inhibition (Iversen and Mishkin 1970) and neurophysiological recordings from a similar region show strong responses for No-Go trials (Sakagami et al. 2001). Microstimulation of the prefrontal convexity leads the monkey to cancel (suppress) a Go trial (Sasaki et al. 1989), and similar responses have been seen in humans (Luders et al. 1988). Together, these pieces of evidence strongly implicate lateral prefrontal cortex itself in response inhibition. The activation we see within the frontal operculum in the current study is consistent with this picture and also consistent with our previous study using the stop-signal task (Aron and Poldrack 2006; see also Rubia et al. 2003), which showed activation in the pars opercularis region, extending into the frontal operculum and strongly into the anterior insula. Moreover, the fact that there is overlap in this right frontal operculum (and surrounding regions) for stopping both manual and vocal responses is good evidence that there is a common mechanism for control within this region that supersedes effector. Indeed, recent lesion and imaging evidence suggests that eye control also depends on a region including the frontal operculum (Chikazoe et al. 2007; Hodgson et al. 2007).
The activation in the current study also extended into the dorsal anterior insula region. In addition to our prior study with the stop-signal task (Aron and Poldrack 2006), it is noteworthy that activation of insula in fMRI studies of response inhibition and attentional shifting has been frequently reported (see meta-analysis by Wager et al. 2004, 2005). Anatomically, the dorsal portion of the anterior insula is dysgranular, with incomplete laminar structure and a cytoarchitectural appearance intermediate between agranular paleocortex and fully developed neocortex. It blends into the fully laminated frontal operculum and shares a similar pattern of connectivity as the frontal operculum region (Mesulam and Elliott 1982). Thus, the dorsal anterior insula has been considered an extension of the frontal operculum and is functionally different from the ventral agranular and the posterior dysgranular (mid-insula) region which are involved in emotion and pain (for a review, see Wager et al. 2004). The proximity of the dorsal anterior insula to agranular insula and its interposition between this older structure and the lateral prefrontal cortex suggest that dorsal anterior insula may play an interface role between the autonomic system and cognitive control. Whether such a role is in fact critical for stopping as such or represents the consequences (for central representations of body state) of stopping is an interesting avenue for further investigation, perhaps via lesion studies or high-resolution imaging.
Although lesion results have shown that the pre-SMA is involved in speech initiation (Ziegler et al. 1997), cumulative evidence has suggested that the pre-SMA is, in many respects, more like prefrontal areas than motor areas (Picard and Strick 2001; Johansen-Berg et al. 2004). Diffusion tensor tractography has shown that the pre-SMA has direct connections with right IFC (Johansen-Berg et al. 2004; Aron et al. 2007). A number of studies have shown that the SMA proper is related to motor output for both manual and speech responses, whereas the pre-SMA is associated with motor control (see Picard and Strick 2001 for a review). In particular, this region is thought to be a “negative motor area”; for example, microstimulation in human patients being evaluated for epilepsy has established that stimulation of a more anterior region of SMA (i.e., pre-SMA in more recent terminology) produces speech as well as manual motor arrest (Penfield and Welch 1951; Luders et al. 1988; Fried et al. 1991). In speech production tasks, the pre-SMA is more involved in word selection, whereas the SMA proper is involved in speech output (Alario et al. 2006). Our data are consistent with this pattern as we found strong SMA activation for Go condition in all the 3 tasks, whereas the pre-SMA was commonly involved in the inhibition of the manual and speech response. The pre-SMA activation is also consistent with prior functional imaging and lesion studies of the stop-signal paradigm (Aron and Poldrack 2006; Floden and Stuss 2006; Aron et al. 2007) as well as monkey physiological studies (Stuphorn and Schall 2006; Isoda and Hikosaka 2007b). Common pre-SMA activation has also been found in inhibiting both manual and oculomotor response (Leung and Cai 2007).
It should be noted that in a positron emission tomography (PET) study, Paus et al. (1993) asked subjects to reverse an overlearned response in 3 tasks (manual, speech, and oculomotor) and found that different subregions of anterior cingulate cortex (ACC) were involved in the 3 tasks. Besides the differences in experimental technique (PET vs. fMRI) and research focus (ACC vs. pre-SMA and right IFC), we think this dissociation might be attributed to the fact that the 2 studies examined different aspects of executive control, that is, response reversal versus response inhibition. As shown by their study, the involvement of ACC in response output was particularly evident when there was a conflict (e.g., response reversal), and the differential ACC activation was determined by the somatotopic organization. In contrast, the present study examined response inhibition, where no response output was required in the Stop trials. It is thus of particular interest for future studies to examine different aspects of response control and how they are modulated by the involvement of different effectors.
The current study identified activation in the STN region for stopping manual responses, consistent with an earlier high-resolution fMRI finding (Aron and Poldrack 2006) and with several forms of adjunct evidence implicating the STN in stop-signal or No-Go response inhibition (Kuhn et al. 2004; van den Wildenberg et al. 2006; Eagle et al. 2007). However, activation was not detected in the STN region for the speech conditions. This could relate to the increased head motion induced by overt speaking, which would reduce the power to detect real differences by smearing activation across adjacent, non-STN voxels. Alternatively, the STN region may not be recruited by the requirement to stop speech in the way it is apparently recruited to stop manual responses. However, a recent report has also implicated the STN in the control of eye movements (Isoda and Hikosaka 2007a), and we expect that the putative 3-way functional-anatomic network between pre-SMA, the right IFC, and the STN (Aron et al. 2007) would indeed have a general function that extends to speech control too. We note again that the current study shows activation of right pre-SMA and right IFC by speech control, and it has been shown that microstimulation of both these areas in human subjects can produce speech arrest (Luders et al. 1988). Future studies, for example those stimulating the STN in DBS patients, might establish more clearly whether the STN is involved in speech control as well.
It is interesting to note that the current study also revealed that the degree of activation in left IFC (i.e., Broca's area) decreased parametrically from Go trials to StopFail trials to late-inhibited and early-inhibited Stop trials. As left IFC initiates the vocal response, this suggests that the Go process can be either blocked or diminished on Stop trials. Alternatively, it could reflect weaker initiation of the production process on trials in which the subject is subsequently able to stop. However, we think the latter account cannot explain the overall pattern obtained in this study. That is, if left IFC activation was not blocked, there should be equal left IFC activation for Go trials and the average of all the Stop trials because Go and Stop trials were independent. There should also be stronger left IFC activation for the StopFail trials than for the Go trials because the Go response for StopFail trials was the strongest. However, we actually found stronger left IFC activation for the Go trials than for the StopFail trials, and the activation for Go trials was also larger than the mean of all the Stop trials. Our data are consistent with the former explanation (i.e., the Go process can be either blocked or diminished on Stop trials) and with a speech production model that suggests speech production is not an all-in-one process (Levelt et al. 1999). Nevertheless, it is not clear whether the right IFC directly inhibits the Go processing in left IFC or the Go process is canceled further downstream (e.g., STN), which leads, via feedback, to a diminution of activation of Broca's area.
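The reasoning above rests on the independence of Go and Stop processes. A minimal simulation sketch of an independent race between the two, under assumed parameters, shows why StopFail trials are expected to be those with the fastest Go processes. All values here (Go finishing-time distribution, stop-signal delay, SSRT) are hypothetical and chosen only for illustration; the simulation concerns finishing times, not BOLD activation, and equating a faster Go process with a "stronger" one is an assumption.

```python
import random

def simulate_race(n_trials=20000, go_mean=500.0, go_sd=80.0,
                  ssd=250.0, ssrt=220.0, seed=0):
    """Simulate Stop trials under an independent race between Go and Stop.

    All parameters are hypothetical (in ms). A Stop trial fails when the
    Go process finishes before the Stop process (ssd + ssrt).
    Returns (mean Go finishing time over all trials,
             mean Go finishing time on failed Stop trials,
             proportion of failed Stop trials).
    """
    rng = random.Random(seed)
    # Go finishing times are drawn independently of the stop signal
    go_times = [rng.gauss(go_mean, go_sd) for _ in range(n_trials)]
    stop_finish = ssd + ssrt
    # Failed inhibition: Go wins the race
    failed = [t for t in go_times if t < stop_finish]
    mean_all = sum(go_times) / n_trials
    mean_failed = sum(failed) / len(failed)
    p_fail = len(failed) / n_trials
    return mean_all, mean_failed, p_fail

mean_all, mean_failed, p_fail = simulate_race()
print(f"mean Go time, all trials:     {mean_all:.1f} ms")
print(f"mean Go time, StopFail only:  {mean_failed:.1f} ms")
print(f"proportion of StopFail:       {p_fail:.2f}")
```

Because the Go distribution is the same on Go and Stop trials in this sketch, the StopFail subset is simply the fast tail of that distribution, which is the selection effect the argument above relies on.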
Our results have significant implications for further elucidation of right IFC function in language processing. For example, though right IFC overactivation during speech production has been consistently revealed in patients who stutter (Brown et al. 2005), it is unclear whether this reflects compensatory activity that helps to recover speech fluency or whether it in fact inhibits and interferes with speech further. The involvement of right IFC in inhibition seems to be consistent with the latter explanation. The speech inhibition paradigm developed in this study is ready to be applied to stutterers to examine their right-hemisphere function. Moreover, the differential role of the 2 hemispheres in initiation and inhibition is also consistent with the observation that the right IFC's compensatory potential for language is limited and less effective than that seen in patients who recover left IFG function (Winhuisen et al. 2005). Long-term TMS over the right IFC has been shown to improve the recovery of naming performance in nonfluent aphasia patients (Naeser et al. 2005).
In sum, our results reveal that right IFC plays a functionally distinct and important role in the control of speech. They challenge the prevailing view that places execution and control of speech processes primarily in the left hemisphere. The common activation of right IFC and pre-SMA in inhibiting both speech and manual responses, the correlation in stopping speed across manual and vocal conditions, and other findings from oculomotor control (Chikazoe et al. 2007) and switching (Konishi et al. 1999; Swainson et al. 2003; Aron et al. 2004) provide strong evidence to suggest the presence of a domain-general response inhibition mechanism that relies upon the right IFC. These results provide further evidence that human language processes are built from basic neural building blocks that may play very general roles in cognition.
James S. McDonnell Foundation 21st Century Science Program Grant (to R.P.); Foundation for Psychological Research, University of California-Los Angeles Center for Culture, Brain and Development (to G.X.).
We thank Sabrina Tom for help with data analysis. Conflict of Interest: None declared.