The present study establishes an electrophysiological index of lexical access in speech production by exploring the locus of the frequency and cognate effects during overt naming. We conducted 2 event-related potential (ERP) studies with 16 Spanish–Catalan bilinguals performing a picture naming task in Spanish (L1) and 16 Catalan–Spanish bilinguals performing a picture naming task in Spanish (L2). Behavioral results showed a clear frequency effect and an interaction between frequency and cognate status. The ERP elicited during the production of high-frequency words diverged from the low-frequency ERP between 150 and 200 ms post-target presentation and kept diverging until voice onset. The same results were obtained when comparing cognate and noncognate conditions. Positive correlations were observed between naming latencies and mean amplitude of the P2 component following the divergence, for both the lexical frequency and the cognate effects. We conclude that lexical access during picture naming begins approximately 180 ms after picture presentation. Furthermore, these results offer direct electrophysiological evidence for an early influence of frequency and cognate status in speech production. The theoretical implications of these findings for models of speech production are discussed.
The ease with which we produce speech may lead us to think that the cognitive and brain mechanisms brought into play in this skill are rather simple. However, speech production is a complex process which entails the orchestration of many processes that unfold over time (e.g., Dell 1986; Caramazza 1997; Levelt et al. 1999). In recent years, the amount of psycholinguistic experimental research exploring these processes has increased, leading to more detailed models of speech production. However, the same cannot be said regarding the investigation of the time course of the neural events underpinning these processes. The present article aims to help fill this gap by exploring the electrophysiological correlates of 2 robust psycholinguistic phenomena in picture naming: the frequency effect and the cognate effect.
Cognitive models of single word production assume that the translation of our communicative goal into speech occurs at various levels of representation. Speaking probably involves at least 1) the retrieval of conceptual information, 2) the selection of the words corresponding to the intended message, 3) the phonological encoding of the selected words, and 4) the retrieval of the articulatory plans (e.g., Dell 1986; Caramazza 1997; Levelt et al. 1999). Although there is still a debate regarding how these processes unfold over time, researchers generally agree on the existence of some sequentiality. That is, concepts are retrieved before words are selected and these in turn are selected before their corresponding phonemes are retrieved. Given this general functional architecture, it is relevant to describe not only the neural structures implicated in the representation of different types of information but also the time course of their involvement.
The time course of lexical access in speech production has been studied using a variety of chronometric tasks (e.g., Schriefers et al. 1990; Dell and O'Seaghdha 1991; Wheeldon and Levelt 1995). These studies have provided evidence for the hypothesis of sequentiality in speech production by showing that a word's conceptual/semantic and syntactic properties are retrieved before its phonological form becomes available. Recently, event-related potential (ERP) studies have supported such a sequence of processing in speech production (e.g., Van Turennout et al. 1997, 1998; Schmitt et al. 2000; Jescheniak et al. 2002; Rodriguez-Fornells et al. 2002; Schiller et al. 2003a). These studies used the Lateralized Readiness Potential (LRP; e.g., Coles 1989; Miller et al. 1992) and the N200 (a component supposed to reflect response inhibition; e.g., Pfefferbaum et al. 1985; Kok 1986; Eimer 1993) in linguistic go/no-go tasks in order to obtain precise measurements of the temporal distance between different stages of speech production. LRP and N200 data indicate that conceptual activation unfolds during the first 150 ms of processing (e.g., Thorpe et al. 1996; see also Johnson and Olshausen 2005; Hauk et al. 2007), lexico-semantic information is processed 90–120 ms before phonological information (e.g., Van Turennout et al. 1997; Schmitt et al. 2000), and syntactic information is retrieved approximately 40 ms before phonological information (e.g., Van Turennout et al. 1998).
In an interesting meta-analysis, Indefrey and Levelt (2004) integrated results of such chronometric ERP studies with the time course of neural activation revealed by magnetoencephalography (MEG) in overt picture naming tasks (Salmelin et al. 1994; Levelt et al. 1998; Maess et al. 2002). According to this meta-analysis and relative to picture onset (time 0), 1) lexical access (understood as lemma retrieval) is estimated to start as early as 150 ms and reach completion at around 275 ms, 2) phonological encoding is thought to take place between 275 and 400 ms; and 3) syllabification is estimated to unfold between 400 and 600 ms.
However, the chronometric interpretation proposed by Indefrey and Levelt's (2004) meta-analysis may be affected by the fact that the tasks that participants performed in the ERP studies cited above differed markedly from those used in the MEG studies considered for comparison. In order to avoid effects of speech-related motor artifacts (e.g., Wohlert 1993; Masaki et al. 2001), none of the ERP experiments actually involved speech production; they relied instead on button-press responses. Furthermore, the tasks were complex and difficult (go/no-go decision, combined with left/right button press decision, followed by production of a pronoun sentence comprising the stimuli), and implied meta-linguistic judgments by the participants (e.g., “if the word refers to an animal, then press the right button if the animal name starts with the letter b or press the left button if it starts with the letter s”). It cannot be excluded that such conditions may have triggered additional cognitive processes affecting the timing of the key processes involved in natural speech production. Also, the time course proposed by models of picture naming is at odds with some time course estimates derived from picture processing and word recognition studies. Whereas some authors argue that during picture processing the brain engages in semantic analysis within the first 150 ms after stimulus onset (e.g., Thorpe et al. 1996; Johnson and Olshausen 2005; Hauk et al. 2007), others dispute the existence of such early differences (e.g., Holcomb and McPherson 1994; Kiefer 2001; Eddy et al. 2006; Sitnikova et al. 2006). Similarly, some ERP studies of visual word recognition converge in showing that lexical processing starts within 200 ms of stimulus onset (e.g., Sereno et al. 1998; Hauk and Pulvermüller 2004; Hauk et al. 2006, 2009), but other studies using the masked priming paradigm showed that it takes at least 250–300 ms to start retrieving lexical information (e.g., Holcomb and Grainger 2006; Grainger and Holcomb 2009). Here we sought to obtain new evidence regarding the time course of lexical access in natural speech production by asking highly proficient bilinguals to simply name pictures overtly while recording ERPs.
As hinted above, recording electroencephalography (EEG) signals during overt speech is methodologically disputable, due to articulation-related artefacts. Indeed, activation of the mouth and face musculature produces electrical potentials larger than brain-generated signals by a factor of 10–100. These large motor and motor preparation potentials, well beyond the EEG amplitudes collected in silent response tasks such as button-presses, could potentially overshadow the cognitive brain activity of interest. However, one ERP study directly comparing overt versus covert speech (Eulitz et al. 2000) and several MEG studies of overt picture naming (e.g., Levelt et al. 1998; Salmelin et al. 1994; Maess et al. 2002) have shown that reliable measures of brain activity can be taken at least until 400 ms after picture onset. Because we are interested in the early phases of picture naming, this simple paradigm should offer insight into the time course of lexical retrieval.
In order to trigger ERP differences during lexical access in speech production, we chose to manipulate word frequency as an independent variable. There is ample evidence that word frequency affects the speed and accuracy with which picture naming is performed: pictures with high-frequency names tend to be named faster and more accurately by normal and brain-damaged speakers than pictures with low-frequency names (e.g., Oldfield and Wingfield 1968; Dell 1990; Jescheniak and Levelt 1994; Navarrete et al. 2006; Almeida et al. 2007; Kittridge et al. 2007). These results suggest that lexical frequency influences or even determines the speed of lexical access (for a similar approach in language comprehension see e.g., King and Kutas 1998; Sereno et al. 1998; Hauk and Pulvermüller 2004).
However, 2 issues must be kept in mind: 1) lexical frequency correlates with conceptual variables such as imageability and familiarity and 2) the stage at which word frequency exerts its effects remains debated. These 2 considerations pose difficulties for the interpretation of ERP differences driven by lexical frequency.
One way to circumvent issue (1) is to manipulate another independent variable that exerts a reliable and consistent effect in picture naming but that is not confounded by conceptual properties. In the present study we tested bilingual individuals and we manipulated not only lexical frequency but also the cognate status of target picture names. The cognate effect refers to the observation of faster naming latencies in bilingual speakers for pictures whose translations are phonologically similar across languages (e.g., the Spanish–English pair “guitarra”—guitar) as compared with pictures with phonologically dissimilar translations (e.g., the Spanish–English pair “perro”—dog; e.g., Costa et al. 2000, 2005; Christoffels et al. 2007). Critically, the cognate status of a word depends on its phonology (formal similarity across translations) and is therefore unrelated to conceptual variables of the sort described above (see description of stimuli and Appendix B). (There has been one proposal arguing that cognate status might influence conceptual processing; Van Hell and De Groot 1998. On this view, cognates should have larger conceptual overlap compared with noncognates. However, such an account has difficulties in explaining the performance of brain-damaged speakers or certain tip-of-the-tongue states and seems to have problems with theoretical logic [for a clear overview see Costa et al. 2005]. Furthermore, results from Van Hell and De Groot's (1998) study are not that clear for concrete cognates [the type of stimuli used here] and are based on the sole assumption that word association does not involve lexical processes. Finally, the absence of any significant differences between cognate and noncognate words for the conceptual ratings of the stimuli used in the present study is at odds [at least for concrete cognates in highly proficient bilinguals] with a conceptual account.) In a recent ERP study of overt naming involving cognates, Christoffels et al. (2007) found a significant negative enhancement around 300 ms after stimulus presentation for cognate ERPs compared with noncognate ERPs, which was interpreted in support of the phonological origin of the cognate effect.
Issue (2), that is, the locus of the frequency effect in speech production, is more complex to address. According to some researchers, word frequency affects the retrieval of lexical nodes from the lexicon (e.g., Caramazza et al. 2001; Navarrete et al. 2006; Almeida et al. 2007), whereas others argue that frequency only affects the retrieval of phonological information during production (e.g., Jescheniak and Levelt 1994; Jescheniak et al. 2003). Our study can only be informative regarding the time course of lexical access if indeed word frequency influences the retrieval of lexical representations and not just the retrieval of phonological segments. However, we believe this issue to be an empirical one. Importantly, recent behavioral, neuroimaging, and patient studies all provide clear evidence supporting the notion that frequency affects both lexical and phonological processing in speech production (e.g., Navarrete et al. 2006; Graves et al. 2007; Kittridge et al. 2007; Knobel et al. 2008). Thus, according to this new evidence, it is reasonable to expect early effects for frequency (150–200 ms, e.g., Indefrey and Levelt 2004).
To summarize, our main goal was to index the onset of lexical access in speech production by comparing ERP differences elicited by word frequency and cognate status in highly proficient Spanish–Catalan speakers naming pictures in L1 (Experiment 1) and highly proficient Catalan–Spanish speakers naming pictures in L2 (Experiment 2). We hypothesized that the point in time where the frequency and cognate effects induced significant ERP differences (i.e., the point in time where the ERPs start to diverge) would provide new insights regarding the onset of lexical access. We referred to Indefrey and Levelt's (2004) study to predict the time windows in which the effects of word frequency and cognate status should be expected. In the case of word frequency, we predicted an early lexical modulation between 150 and 200 ms. For the cognate effect, based on the ERP study of Christoffels et al. (2007), we predicted a slightly later modulation (after 275 ms).
Note that our predictions are solely based on temporal information and not bound to the modulation of specific ERP components. The advantage of such a design is that it will strongly simplify the interpretation of results. It is noteworthy that not many studies have involved the recording of ERPs during overt naming (e.g., Christoffels et al. 2007; Ganushchak and Schiller 2008; Koester and Schiller 2008; Verhoef et al. 2009). However, these studies have investigated bilingual language control, error monitoring and morphological priming in picture–word interference, respectively; therefore the patterns of brain activity elicited in the current study, that is, in the context of simple picture naming, may be quite different.
Experiment 1: Highly Proficient Spanish–Catalan Bilinguals Naming Pictures in L1
Eighteen participants took part in the experiment. All were highly proficient early Spanish–Catalan bilinguals (see Appendix A) exposed almost exclusively to Spanish during their first 3–4 years of life and reported to have a preference or dominance for that language. Participants were students at the University of Barcelona (ages 18–25) and received course credits or monetary compensation for their participation. All were right-handed, had normal or corrected-to-normal vision and did not suffer from any neurological or motor problems. Two of the participants had to be removed from the analyses: one due to an unacceptable number of EEG artefacts, and another due to technical problems during EEG recording. Statistical analyses are thus based on 16 individual data sets.
Sixty-four line-drawings of familiar objects corresponding to Spanish words were selected, spanning a wide range of semantic categories (e.g., body parts, buildings, animals, furniture; see Fig. 1). Two independent variables were manipulated orthogonally: cognate status and lexical frequency of the picture names. This design therefore entailed 4 experimental conditions involving 16 pictures each (see Appendix B). High-frequency picture names were at least 10 times more frequent than low-frequency picture names (mean lemma frequency [LEXESP; Sebastián-Gallés et al. 2000]: low frequency: 3.9; high frequency: 52). The mean lexical frequency of the picture names in the cognate and noncognate sets was similar (noncognates: 27.8; cognates: 32.2). The cognate words shared on average 4 phonological segments with their translation equivalents (range = 2–8). Almost all cognates (27 out of 32) shared at least the whole first syllable with their corresponding translations, and all of them shared at least the first phoneme. There was no obvious phonological or orthographic overlap between noncognates. None shared their first phoneme and only 7 out of 32 shared the first vowel appearing in a word. Words in the 4 conditions were also matched for length (mean syllable length: low-frequency noncognates: 2.7; low-frequency cognates: 2.6; high-frequency noncognates: 2.4; high-frequency cognates: 2.2). Physical variance within each of the 4 stimulus sets was evaluated using interstimulus pixel-wise correlations (Thierry et al. 2007), and no difference was found between experimental conditions (Fig. 1). In addition, 40 students who did not participate in the study rated the complexity of the stimuli using a Spanish adaptation of the methods described in Snodgrass and Vanderwart's (1980) study. We found no differences between any of the 4 stimulus sets taken 2-by-2 (P > 0.2).
Finally, all items used in the experiment were rated on 3 conceptual variables (familiarity, typicality, and imageability) by 120 students not tested in the ERP experiment, using a Spanish adaptation of the Snodgrass and Vanderwart (1980) procedure. Low- and high-frequency items differed significantly only in familiarity ratings (P < 0.001), not in typicality or imageability (P > 0.7). Critically, there were no differences between cognates and noncognates for any of the 3 conceptual properties (P > 0.5). To increase experimental power, each picture was presented once in each of 6 separate blocks (64 stimuli per block), with the order of presentation within blocks randomized for each participant.
Participants were tested individually in a soundproof room. Instructions were administered in Spanish. Participants were asked to name each picture presented in Spanish as fast and as accurately as possible. Before the experiment started, participants were familiarized with the pictures and they were given feedback (correct picture name) when they made a non response or an incorrect response (23%). Each experimental trial had the following structure: 1) a blank interval of 1000 ms was shown at the centre of a computer screen; 2) a picture was displayed at the centre of the screen until a response was given or for a maximum of 1500 ms; 3) a blank intertrial interval of 1000 ms. An asterisk was presented for 500 ms before the first trial and after the last trial to signal the beginning and end of each block. Blocks were separated by a 30-s pause. Response latencies were measured from the onset of the stimulus by means of a voice key. Stimulus presentation was controlled by an adaptation of the EXPE Program (Pallier et al. 1997). The entire experimental session lasted approximately 25 min. At the end of the experimental session, participants were asked to fill in a questionnaire regarding language use and proficiency in their 2 languages.
The EEG was continuously recorded from 31 scalp locations (Fp1, Fpz, Fp2, F3, Fz, F4, F7, F8, FC1, FC2, FC5, FC6, C3, Cz, C4, CP1, CP2, CP5, CP6, P3 , Pz, P4, PO1, PO2, T3, T4, T5, T6, O1, Oz, O2) using tin electrodes embedded in an elastic cap. Five additional electrodes were placed on the participants’ head. Two bipolar electrodes were placed next to and beneath the left eye (EOGH and EOGV) to register eye movements; 2 other electrodes were placed on the participants’ right and left mastoid (A1 and A2) for on-line referencing and a final electrode was placed on the participants’ nose as off-line reference channel. The EEG was continuously recorded and digitized at 250 Hz. Impedances were reduced to 3 kOhms or less prior to the beginning of recording. Before segmentation, the EEG was processed through a low-pass filter with a cut-off frequency of 20 Hz and a high-pass filter of 0.03 Hz. The EEG was then segmented into 750-ms-long epochs starting 200 ms prior to stimulus onset. We chose to segment the EEG only up to 550 ms after stimulus onset to avoid speech contamination (e.g., Wohlert 1993; Masaki et al. 2001). Just as in Christoffels et al. (2007), Verhoef et al. (2009), and Koester and Schiller (2008) we assumed that analyzing the ERPs before the actual response would result in artifact-free ERPs. In contrast to those studies, however, we chose to segment in such a way that 1) fast responses could not induce motor artifacts and 2) latency jitter by strong EMG activity could be avoided in the averaging. For the naming latencies, the fastest average response was 650 ms. We segmented the EEG up to 550 ms (100 ms less than the fastest average response, to ensure that we could include as many epochs as possible) and removed prior to averaging all epochs where the response was faster than 550 ms. 
The segmented EEG underwent Gratton and Coles (1989) ocular correction and artifact rejection where trials with an amplitude voltage over 100 μV or a change in amplitude between adjacent samples of more than 200 μV were dismissed. Also trials where participants’ responses were incorrect or absent and trials containing other motor artifacts were rejected from the dataset before averaging. The 750-ms epochs were then averaged in reference to the 200-ms prestimulus baseline. In total, ERP analyses were based on an average of 162 segments (SD = 21) per condition taken 2-by-2 (e.g., low-frequency noncognates + low-frequency cognates for the low-frequency condition) per subject (low frequency: 164, high frequency: 160, noncognate: 164, cognate: 160).
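The rejection and baseline steps described above can be illustrated with a short sketch. It is only an approximation under stated assumptions and not the software used in the study: the function names are hypothetical, each epoch is assumed to be a plain list of amplitudes in μV for one channel sampled every 4 ms (250 Hz), and the first 50 samples are taken to be the 200-ms prestimulus baseline. Ocular correction and the response-based exclusions are not modelled.

```python
SAMPLE_MS = 4                          # 250 Hz sampling, as reported
BASELINE_SAMPLES = 200 // SAMPLE_MS    # 200-ms prestimulus baseline = 50 samples


def is_clean(epoch, max_abs_uv=100.0, max_step_uv=200.0):
    """Apply the two reported amplitude criteria to one epoch (hypothetical helper)."""
    if any(abs(v) > max_abs_uv for v in epoch):
        return False
    # Change in amplitude between adjacent samples
    return all(abs(b - a) <= max_step_uv for a, b in zip(epoch, epoch[1:]))


def baseline_correct(epoch, n_baseline=BASELINE_SAMPLES):
    """Subtract the mean of the prestimulus samples from every sample."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [v - base for v in epoch]


def average_clean_epochs(epochs):
    """Average the baseline-corrected epochs that pass rejection; None if none pass."""
    kept = [baseline_correct(e) for e in epochs if is_clean(e)]
    if not kept:
        return None
    return [sum(col) / len(kept) for col in zip(*kept)]
```

In this sketch rejection is applied before baseline correction, following the order in which the steps are listed in the text; whether the original pipeline evaluated the thresholds on raw or baseline-corrected amplitudes is not stated.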
Four types of responses were scored as errors: 1) production of incorrect names; 2) verbal disfluencies (stuttering, utterance repairs, production of nonverbal sounds that triggered the voice key); 3) recording failures; 4) errors in which the bilingual participants named the picture in Catalan. This gave a total of 1.2% erroneous responses. Furthermore, outliers (i.e., responses exceeding 3 standard deviations from the participant's mean, 2.7%) and responses faster than 550 ms (10.6%) were excluded from all analyses. For the ERP analysis another 4.7% of the data were excluded due to artefacts. Given the very low amount of erroneous responses (1.2%) we decided not to run a statistical error analysis.
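As an illustration of the latency trimming just described, the following sketch applies the two exclusion criteria to one participant's correct naming latencies. The function name and the example values are hypothetical, and the order of the two criteria (both evaluated against the untrimmed mean and standard deviation) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def trim_latencies(rts_ms, min_rt_ms=550, sd_cutoff=3.0):
    """Drop latencies beyond 3 SD of the participant's mean or faster than 550 ms."""
    m = mean(rts_ms)
    sd = stdev(rts_ms)
    return [
        rt for rt in rts_ms
        if abs(rt - m) <= sd_cutoff * sd and rt >= min_rt_ms
    ]


# Made-up latencies: twenty typical trials, one fast response, one extreme outlier
example = [700] * 20 + [500, 2000]
trimmed = trim_latencies(example)
```

With these made-up values the 2000-ms response falls outside the 3-SD band and the 500-ms response falls below the 550-ms floor, so only the twenty 700-ms trials remain.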
Separate subject (F1) and item (F2) analyses were carried out examining 3 independent variables: Cognate Status (cognates vs. noncognates), Frequency Status (low frequency vs. high frequency) and Block (6 repetitions).
The main goal was to identify the latency at which the ERPs of the 2 contrasts of interest (low-frequency vs. high-frequency ERPs and cognate vs. noncognate ERPs) started to diverge significantly from one another. We adopted a method suggested by Guthrie and Buchwald (1991; see also Thierry et al. 1998, 2003). ERPs were compared between conditions at each electrode by running 2-tailed paired t-tests at every sampling point (4 ms) starting from target presentation (0 ms) until at least a sequence of 12 consecutive t-test samples exceeded the 0.05 significance level. We also estimated the splitting point, that is, the point in time where ERPs started to diverge in each individual subject, in order to perform a correlation analysis between this splitting point and the mean naming latencies of each subject.
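The onset-detection rule described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the function names and the synthetic data are hypothetical, reporting the first sample of the significant run as the onset is an assumption, and the critical value 2.131 is the standard two-tailed t threshold at α = 0.05 for 15 degrees of freedom (16 subjects).

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic across subjects for one sampling point."""
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        # Degenerate case: identical differences in every subject
        return 0.0 if mean(diffs) == 0 else math.copysign(math.inf, mean(diffs))
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def divergence_onset_ms(cond_a, cond_b, t_crit, run_length=12, step_ms=4):
    """Latency (ms) of the first sample in the first run of `run_length`
    consecutive significant paired t-tests, or None if no such run exists.

    cond_a, cond_b: one list of amplitudes per subject, sample 0 = stimulus onset.
    """
    run = 0
    for s in range(len(cond_a[0])):
        t = paired_t([subj[s] for subj in cond_a], [subj[s] for subj in cond_b])
        run = run + 1 if abs(t) > t_crit else 0
        if run == run_length:
            return (s - run_length + 1) * step_ms
    return None


# Synthetic example: 16 subjects, conditions identical until sample 43 (172 ms)
cond_b = [[0.0] * 60 for _ in range(16)]
cond_a = [[0.0] * 43 + [1.0 + 0.1 * i] * 17 for i in range(16)]
onset = divergence_onset_ms(cond_a, cond_b, t_crit=2.131)
```

The consecutive-samples requirement guards against isolated significant points arising by chance across the many t-tests; the run length of 12 samples corresponds to 48 ms at the 4-ms sampling step.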
Second, 3 time window analyses were conducted to explore possible interactions between frequency and cognate status, and to explore the effect of repetition in the ERPs: 1) a 2 × 2 × 9 repeated measures ANOVA was conducted with Frequency Status (low frequency vs. high frequency), Cognate Status (noncognate vs. cognate), and Electrode (9 electrode clusters; see below) as independent variables; 2) a 2 × 6 × 9 ANOVA was conducted with Frequency Status, Block (6 blocks) and Electrode as independent variables; 3) a 2 × 6 × 9 ANOVA was conducted with Cognate Status, Block, and Electrode as independent variables. Five time windows were selected, based upon visual inspection of the Grand Averages: 1) 0–80 ms, 2) 80–160 ms, 3) 160–240 ms (P2), 4) 240–320 ms (N3), and 5) 320–420 ms (P3). Note that P2, N3 and P3 are used as descriptive labels here. Electrodes were clustered in 9 groups as follows: Left frontal (Lfr): F7, F3, FC5; Fronto-central (Fc): Fz, FC1, FC2, Cz; Right frontal (Rfr): F8, F4, FC6; Left Central (Lc): T3, C3, CP5; Centro-Parietal (Cpar): CP1, CP2, Pz; Right central (Rc): T4, C4, CP6; Left Parietal (Lpar): T5, P3, PO1; Occipital (Oc): O1, Oz, O2; Right Parietal (Rpar): T6, P4, PO2.
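For concreteness, the dependent measure entering these ANOVAs (mean amplitude per electrode cluster and time window) can be sketched as follows. The cluster membership and window boundaries are taken from the text; the function name, the data layout (one list of amplitudes per electrode, sample 0 at stimulus onset, 4-ms steps), and the treatment of each window as end-exclusive are assumptions made for illustration.

```python
CLUSTERS = {
    "Lfr": ["F7", "F3", "FC5"],
    "Fc": ["Fz", "FC1", "FC2", "Cz"],
    "Rfr": ["F8", "F4", "FC6"],
    "Lc": ["T3", "C3", "CP5"],
    "Cpar": ["CP1", "CP2", "Pz"],
    "Rc": ["T4", "C4", "CP6"],
    "Lpar": ["T5", "P3", "PO1"],
    "Oc": ["O1", "Oz", "O2"],
    "Rpar": ["T6", "P4", "PO2"],
}

WINDOWS_MS = {
    "early1": (0, 80),
    "early2": (80, 160),
    "P2": (160, 240),
    "N3": (240, 320),
    "P3": (320, 420),
}


def window_cluster_mean(erp, cluster, window_ms, step_ms=4):
    """Mean amplitude over all electrodes of a cluster within a time window.

    erp: dict mapping electrode name -> list of amplitudes (sample 0 = onset).
    The window is treated as [start, end) in ms, an assumption of this sketch.
    """
    start, end = (t // step_ms for t in window_ms)
    values = [v for ch in CLUSTERS[cluster] for v in erp[ch][start:end]]
    return sum(values) / len(values)
```

A call such as `window_cluster_mean(erp, "Lpar", WINDOWS_MS["P2"])` would then yield one value per subject and condition, which is the kind of cell entry a repeated measures ANOVA operates on.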
Third, because the method presented here is new for studying speech production, correlation analyses were performed to help us understand the relationship between the different phases of information processing indexed by ERPs and the relationship of these phases with behavioral results. On the one hand, correlation analyses were performed on the individual splitting point in the ERPs with the individual mean naming latencies for the high frequency and cognate condition, because these are the measures likely to show whether or not the splitting point indexes the onset of lexical access. On the other hand, correlation analyses were performed on the peak latencies of the P2, N3, and P3. Peak latencies were measured at the electrode of maximum amplitude for each component in each subject (Picton et al. 2000). Finally, a correlation analysis was performed on individual differences in naming latencies (the frequency effect and the cognate effect in each subject) and the individual mean amplitude differences for each ERP processing phase in each subject. All correlation analyses were conducted making use of the same electrode clusters as for the time window analyses.
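The last of these analyses pairs, for each subject, a behavioral effect (a latency difference) with an ERP amplitude difference. A minimal sketch of the Pearson product–moment correlation and its associated t statistic is given below; the function names are hypothetical, and the conversion from r to t (with n − 2 degrees of freedom) is the standard textbook formula rather than anything specific to this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# With 16 subjects, a correlation of 0.531 corresponds to t of about 2.34 (df = 14)
t_example = t_from_r(0.531, 16)
```

Here `x` would hold, for instance, each subject's low-minus-high frequency naming latency difference and `y` the corresponding mean amplitude difference in one cluster and time window.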
In the analysis of the naming latencies the main effects of Block (F1(5,75) = 8.9, MSE = 4867.9, P < 0.001; F2(5,300) = 14.7, MSE = 2139.9, P < 0.001) and Frequency Status (F1(1,15) = 47.9, MSE = 3564.1, P < 0.001; F2(1,60) = 8.5, MSE = 19,890.2, P = 0.005) were significant, and the effect of Cognate Status was significant in the analysis by subjects and marginally significant in the analysis by items (F1(1,15) = 42.1, MSE = 1603.9, P < 0.001; F2(1,60) = 3.7, MSE = 19,890.2, P = 0.060). Participants named pictures with high-frequency names faster than pictures with low-frequency names, and pictures with cognate names faster than pictures with noncognate names (see Table 1). The interaction between Frequency Status and Cognate Status was significant only in the analysis by subjects (F1(1,15) = 8.0, MSE = 2484.8, P = 0.013; F2(1,60) = 1.4, MSE = 19,890.2, P = 0.250). That is, the cognate effect was larger for high-frequency words (41 ms) than for low-frequency words (12 ms). Finally, there was also a significant interaction between Cognate Status and Block in the subject analysis (F1(5,75) = 3.6, MSE = 1088.5, P = 0.014; F2 < 1). However, as can be seen in Figure 2, this interaction was mostly driven by the much smaller cognate effect in the second block compared with the other blocks. None of the other interactions were significant (P > 0.1; see Fig. 2).
| | Low-frequency noncognates (ms) | Low-frequency cognates (ms) | High-frequency noncognates (ms) | High-frequency cognates (ms) |
|---|---|---|---|---|
| Experiment 1 (naming in L1) | 730 | 718 | 702 | 661 |
| Experiment 2 (naming in L2) | 764 | 742 | 737 | 694 |
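The effect sizes quoted in the text can be recovered from the condition means in the table. The short sketch below does this arithmetic; the variable and function names are hypothetical, and computing the main effects as unweighted means of the tabled cell means is an assumption, so the values are descriptive only.

```python
# Mean naming latencies (ms) from the table: (frequency, cognate status) -> mean
TABLE_1 = {
    "Experiment 1 (L1)": {
        ("low", "noncognate"): 730, ("low", "cognate"): 718,
        ("high", "noncognate"): 702, ("high", "cognate"): 661,
    },
    "Experiment 2 (L2)": {
        ("low", "noncognate"): 764, ("low", "cognate"): 742,
        ("high", "noncognate"): 737, ("high", "cognate"): 694,
    },
}


def effects(cells):
    """Descriptive effects (ms) computed from the four cell means."""
    cognate_low = cells[("low", "noncognate")] - cells[("low", "cognate")]
    cognate_high = cells[("high", "noncognate")] - cells[("high", "cognate")]
    low_mean = (cells[("low", "noncognate")] + cells[("low", "cognate")]) / 2
    high_mean = (cells[("high", "noncognate")] + cells[("high", "cognate")]) / 2
    return {
        "cognate_effect_low": cognate_low,
        "cognate_effect_high": cognate_high,
        "cognate_effect": (cognate_low + cognate_high) / 2,
        "frequency_effect": low_mean - high_mean,
    }


exp1 = effects(TABLE_1["Experiment 1 (L1)"])
# exp1["cognate_effect_high"] is 41 and exp1["cognate_effect_low"] is 12,
# matching the values reported for the interaction in Experiment 1.
```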
Onset Latency Analyses
ERPs displayed a typical P1–N1–P2 peak sequence classically observed for visual stimulus presentation. t-Tests at each sampling point indicated that high-frequency ERPs started to diverge significantly from low-frequency ERPs 172 ms after picture onset (see Figs 3a and 4a). As can be seen in Figure 4a, the distribution of electrodes showing a significant effect at this time point was particularly left-lateralized (however, 4 ms later, almost all electrodes showed significant differences). The averaged splitting point computed from individual splitting point estimates was 167 ms, that is, almost within one time sampling unit of the splitting time measured in the grand-averages. Cognate ERPs started to diverge significantly from noncognate ERPs 200 ms after picture onset (see Figs 3b and 4b). The distribution of electrodes displaying significant differences at this time point was more right-lateralized (but here too, at the next time point practically all electrodes showed significant differences). Again the grand-average splitting time of 200 ms closely resembled the averaged individual splitting point (192 ms). The difference in onset between the splitting point of the frequency effect and the splitting point of the cognate effect was significant (measured by individual splitting points; P = 0.05).
Time Window Analyses
Early time windows (0–80 ms and 80–160 ms).
The only effect found for all 3 ANOVAs conducted in the early time windows was a small trend for Block between 80 and 160 ms (F(5,75) = 2.2, MSE = 20,943.4, P = 0.098). In comparison to the first repetition, all subsequent repetition ERPs seemed to be more positive going. However, independent ANOVAs of the possible contrasts with correction for multiple comparisons showed no significant effects (P > 0.150). None of the other main effects or interactions were significant (P > 0.175).
Late time windows (160–240 ms [P2], 240–320 ms [N3], and 320–420 ms [P3]).
In all 3 time windows significant main effects were present for Frequency Status (P2: F(1,15) = 8.9, MSE = 13.6, P = 0.009; N3: F(1,15) = 33.1, MSE = 31.1, P < 0.001; P3: F(1,15) = 39.1, MSE = 18.8, P < 0.001) and Cognate Status (P2: F(1,15) = 10.6, MSE = 12.1, P = 0.005; N3: F(1,15) = 49.9, MSE = 25, P < 0.001; P3: F(1,15) = 28.1, MSE = 22.4, P < 0.001). ERPs in the high-frequency condition were significantly more negative than those elicited in the low-frequency condition and ERPs in the cognate condition were significantly more negative than those elicited in the noncognate condition. The distribution of this effect emerged at posterior sites of the scalp, but expanded rapidly over large parts of the entire scalp (see Fig. 5). The interaction between Frequency Status and Cognate Status was significant, but only for the 2 latest time windows (P2: F < 1; N3: F(1,15) = 6.9, MSE = 14, P = 0.019; P3: F(1,15) = 16.6, MSE = 9.4, P = 0.001). Starting around 240-ms poststimulus presentation, amplitude differences between cognate and noncognate ERPs were larger in the high than in the low-frequency condition. There was neither a main effect of Block nor interactions of Block with Frequency Status and/or Cognate Status in any of the 3 time windows (P > 0.45).
Pearson product–moment correlation analyses showed no significant correlation between the individual splitting point of the low-frequency versus high-frequency ERPs and the individual mean naming latencies of the high-frequency condition, and no significant correlation between the individual splitting point of the noncognate versus cognate ERPs and the individual mean naming latencies of the cognate condition (P > 0.6). The correlations between the individual peak latencies of the P2, N3, and P3 were not significant either (P > 0.2).
Significant positive Pearson product–moment correlations were found in the P2 range between the difference in mean amplitude between high and low-frequency ERPs and the high-/low-frequency naming latency difference over left-parietal (R = 0.531, P = 0.034) and right parietal electrodes (R = 0.511, P = 0.043), and a trend towards a positive correlation for the P2 at the right-central electrodes (R = 0.455, P = 0.077): The larger the difference in naming latencies between high- and low-frequency words, the larger the difference in P2 mean amplitude between high- and low-frequency conditions.
In the N3 range, only a small trend was found in the same direction over left (R = 0.477, P = 0.082) and right parietal (R = 0.429, P = 0.098) electrode clusters. Finally, in the P3 range, there were no significant correlations between the frequency effect in the naming latencies and the mean amplitude differences in the ERPs (P > 0.250).
Analyses of the cognate effect revealed a remarkably similar pattern of correlations: Trends towards positive correlations were found for the cognate effect in the P2 range over left parietal (R = 0.465, P = 0.070) and right parietal (R = 0.436, P = 0.091) electrode clusters. The trend for correlation found in the P2 range disappeared in the N3 (P > 0.190) and P3 (P > 0.475) ranges. Both the frequency effect and the cognate effect seemed to relate mostly to the early P2.
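The P values attached to the correlations above can be checked from R and the sample size alone. The following is a minimal sketch, assuming that the reported tests are two-tailed and based on all 16 participants (df = 14); the function names are illustrative and not part of the original analysis pipeline.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(r, n, steps=2000):
    """Two-tailed P value for a Pearson correlation r observed in n subjects.

    Assumption: the test statistic is t = r * sqrt(df / (1 - r^2)), df = n - 2.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area under the density on [0, t].
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 1 - 2 * area

# Frequency effect, P2 range, Experiment 1 (R values taken from the text).
for label, r in [("left parietal", 0.531), ("right parietal", 0.511), ("right central", 0.455)]:
    print(f"{label}: R = {r:.3f}, P = {two_tailed_p(r, 16):.3f}")
```

Under these assumptions the sketch returns P = 0.034, 0.043, and 0.077 for the 3 clusters, in agreement with the values reported for the frequency effect in the P2 range.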
As expected, pictures with high-frequency names yielded faster naming latencies than those with low-frequency names, and cognate picture names were produced faster than noncognate picture names. Interestingly, high-frequency ERPs started to diverge significantly from low-frequency ERPs 172 ms after target presentation, with high-frequency picture names eliciting greater negativities than low-frequency names. Similar results were found for the cognates: cognate ERPs started to diverge significantly from the noncognate ERPs 200 ms after picture presentation, with more negativity in the cognate than in the noncognate condition. For both the frequency and the cognate manipulations, these differences remained from the moment of the split until the end of the epoch. The absence of peak latency correlations between the components that were significantly modulated (P2, N3, P3) suggests that these components have different functional underpinnings and that the early differences in the P2 range are not merely the consequence of latency jitter caused by stronger variations registered at a later time in each epoch. Finally, it was shown that the ERP modulations for the frequency and cognate effects were not influenced by repeating the same stimuli in different blocks.
The time estimate of 172 ms found for the frequency effect is consistent with the time estimate of 150–175 ms for lexical access proposed by Indefrey and Levelt (2004). Based on these authors’ time estimates, both the frequency and the cognate effect seem to have an early, lexical influence during speech production. Recall that we used cognate status as a control for possible conceptual confounds of the frequency effect because it has no obvious relationship with conceptual variables (see stimulus ratings). In fact, when plotting cognate and frequency ERPs together, ERP morphology was remarkably comparable (see Fig. 6a). Given this similarity and the fact that the cognate manipulation was not confounded by conceptual variables, we may interpret the time of split as a good approximation of lexical access rather than as a difference driven by conceptual factors.
We also need to consider the possibility that the frequency effect on ERP amplitude may have its origin at a phonological rather than a lexical level (e.g., Jescheniak and Levelt 1994). This, however, seems unlikely given that the first significant ERP differences, both between high- and low-frequency picture names and between cognate and noncognate picture names, occur very early. Unless one assumes that, once an object is sufficiently recognized to initiate language-related processing (around 150 ms after picture presentation; e.g., Thorpe et al. 1996; Schmitt et al. 2000), phonological retrieval unfolds in parallel with lexico-semantic and syntactic retrieval of a word, a phonological account for our findings seems very implausible. Because the literature does not support the idea that phonological and lexical processing proceed in parallel (e.g., Salmelin et al. 1994; Van Turennout et al. 1997, 1998; Indefrey and Levelt 2004), we can conclude that the results here are indexing lexical access rather than phonological retrieval. Moreover, the present findings show not only that the first differences in the ERPs start to emerge at the early P2, but also that the most reliable correlations between ERP mean amplitude differences and naming latency differences for the contrasts of interest were present at this component. This suggests that both phenomena have an influence during lexical processing and also that they affect speech production most strongly at an early point in time, that is, between 160 and 240 ms after picture presentation.
In addition, the fact that no correlations were present between the individual splitting points in the ERPs for the contrasts of interest and the individual naming latencies of the fastest conditions of these contrasts (high-frequency and cognate conditions) suggests that the time estimate derived from the point of divergence is indeed indicative of the start (or at least a very early stage) of lexical access as opposed to its termination. Before considering the theoretical implications of these findings further, we sought to consolidate the robustness and reliability of these results by running a replication experiment. We decided to test Catalan–Spanish bilinguals performing the same task in their L2 (Spanish) to test whether the timing of the differences would map onto those found in bilinguals doing the task in their L1. If the rationale used in the previous experiment is correct, we should be able to replicate these results, regardless of whether participants name the pictures in their L1 or their L2.
Experiment 2: Catalan–Spanish High Proficient Bilinguals Naming in L2
Seventeen participants took part in the experiment. All were highly proficient early Catalan–Spanish bilinguals (a description of both languages and relative use is presented in Appendix A) exposed almost exclusively to Catalan during their first 4–5 years of life (see Appendix A) and students at the University of Barcelona (age range 18–25). Participants received course credits or monetary compensation for their participation. All were right-handed, had normal or corrected-to-normal vision and did not suffer from any neurological or motor problems. One of the participants had to be removed from the analyses due to an unacceptably high number of artefacts, leaving 16 participants in the statistical analyses reported below.
Stimuli and Procedures (including EEG acquisition) were identical to those in Experiment 1.
In general, except when specified, all analyses were identical to those conducted in Experiment 1.
There were 1.1% erroneous responses, 4.4% outliers, and 3.9% fast responses, all of which were excluded from the behavioral and ERP analyses. In addition, another 5.3% of the trials were excluded from the ERP analyses due to artefacts. We decided not to run a statistical error analysis due to the very low error rate.
The behavioral analyses were identical to Experiment 1.
The ERP analyses were in all aspects identical to those described in Experiment 1, except that the time windows were slightly shifted: 1) 0–80 ms, 2) 80–180 ms, 3) 180–260 ms (P2), 4) 260–350 ms (N3), and 5) 350–450 ms (P3). In total, ERP analyses were based on an average of 172 segments (SD = 21) per condition per subject (low frequency: 173, high frequency: 171, noncognate: 172, cognate: 172).
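The window-based measure used in these analyses, mean amplitude per time window, can be sketched as follows. This is an illustrative sketch only: the 4-ms sampling step, the half-open window boundaries, and the synthetic waveform are assumptions made for the example and are not taken from the recording parameters of the study.

```python
# Time windows (ms) as listed for Experiment 2; the first two labels are shorthand.
WINDOWS_MS = {
    "early1": (0, 80),
    "early2": (80, 180),
    "P2": (180, 260),
    "N3": (260, 350),
    "P3": (350, 450),
}

def mean_amplitude(erp, times_ms, window):
    """Mean of the ERP samples whose time stamp falls in [start, end)."""
    start, end = window
    samples = [v for v, t in zip(erp, times_ms) if start <= t < end]
    return sum(samples) / len(samples)

# Hypothetical averaged waveform sampled every 4 ms (assumed sampling step).
times_ms = list(range(0, 452, 4))
erp = [0.01 * t for t in times_ms]  # synthetic linear ramp, in microvolts

p2 = mean_amplitude(erp, times_ms, WINDOWS_MS["P2"])
print(f"Mean amplitude in the P2 window: {p2:.2f} microvolts")
```

Computing one such value per participant, condition, and electrode cluster would yield the input for the ANOVAs described in the text.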
Both in the subject and item analyses, significant main effects of Frequency (F1(1,15) = 45.4, MSE = 2934.7, P < 0.001; F2(1,60) = 9.9, MSE = 15014.7, P = 0.003) and Cognate Status (F1(1,15) = 33.7, MSE = 2986.2, P < 0.001; F2(1,60) = 7.8, MSE = 15014.7, P = 0.007) were found. These results replicated those of Experiment 1 and indicated the presence of frequency and cognate effects in the naming latencies (see Table 1). The main effect of Block only reached significance in the item analysis (F1(5,75) = 2.1, MSE = 10,589.6, P = 0.151; F2(5,300) = 7.3, MSE = 2397.6, P < 0.001). As in Experiment 1, a significant interaction between Frequency and Cognate Status was present for the subject analysis (F1(1,15) = 11.1, MSE = 956.9, P = 0.005; F2 < 1). The cognate effect was larger for high-frequency words (43 ms) compared with low-frequency words (22 ms). Finally, there was a significant interaction between Frequency and Block in the analysis by subjects (F1(5,75) = 4.9, MSE = 932.3, P = 0.003; F2(1,60) = 1.6, MSE = 2397.6, P = 0.158) and a trend toward an interaction between Cognate Status and Block (F1(5,75) = 2.2, MSE = 851.7, P = 0.081; F2 < 1) only in the subject analysis. As in Experiment 1 these interactions with Block did not reveal a stable pattern, but rather randomly varying sizes of the frequency and cognate effects from block to block (see Fig. 7). No other interaction reached significance (P > 0.5).
Onset Latency Analyses
As in Experiment 1, ERPs displayed a typical P1-N1-P2 peak sequence (see Fig. 8). t-Tests at each sampling point indicated that high-frequency ERPs started to diverge significantly from low-frequency ERPs 184 ms after picture onset (see Figs 8a and 9a) over anterior, central, and posterior midline electrodes (at the next time point almost all electrodes showed significant effects). This time estimate was close to the averaged individual splitting point (177 ms). The cognate ERPs started to diverge significantly from noncognate ERPs 184 ms after picture onset (see Figs 8b and 9b). Furthermore, the time estimate derived from the grand-average splitting point coincided exactly with the averaged individual splitting latencies (184 ms). The distribution of electrodes showing significant effects at this time point was widely spread over the scalp (see Fig. 9b). No significant difference was present between the individual splitting latency of the frequency effect and that of the cognate effect (P = 0.4).
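The logic of the onset latency analysis (a paired t-test across participants at every sample, with the onset taken as the first sample showing a significant difference) can be illustrated with a minimal sketch. The consecutive-sample criterion (min_run), the uncorrected two-tailed threshold, the 4-ms sampling step, and the synthetic data are assumptions introduced for the example; the exact criterion used in the study is not restated in this section.

```python
import math
import statistics

T_CRIT_DF15 = 2.131  # two-tailed critical t, alpha = 0.05, 15 degrees of freedom

def paired_t(a, b):
    """Paired t statistic for two equal-length lists of per-subject values."""
    diffs = [x - y for x, y in zip(a, b)]
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se == 0:  # all differences identical: t is 0 or unbounded
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / se

def divergence_onset(cond_a, cond_b, times_ms, t_crit, min_run=1):
    """First time (ms) at which |t| exceeds t_crit for min_run consecutive
    samples, or None. cond_a/cond_b hold one list of samples per subject."""
    run = 0
    for i in range(len(times_ms)):
        a = [subj[i] for subj in cond_a]
        b = [subj[i] for subj in cond_b]
        if abs(paired_t(a, b)) > t_crit:
            run += 1
            if run == min_run:
                return times_ms[i - min_run + 1]
        else:
            run = 0
    return None

# Synthetic data: 16 subjects, one sample every 4 ms, effect starting at 184 ms.
times_ms = list(range(0, 404, 4))
low = [[0.25 * s for _ in times_ms] for s in range(16)]
high = [
    [0.25 * s + (0.5 if s % 2 == 0 else -0.5) - (1.0 if t >= 184 else 0.0)
     for t in times_ms]
    for s in range(16)
]

onset = divergence_onset(high, low, times_ms, T_CRIT_DF15, min_run=3)
print(f"Estimated divergence onset: {onset} ms")
```

In this synthetic example the estimated onset is 184 ms by construction. Requiring several consecutive significant samples is one common way to limit spurious onsets when many uncorrected tests are run.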
Time Window Analyses
Early time windows (0–80 ms and 80–180 ms).
None of the ANOVAs conducted showed significant effects in these time windows (P > 0.350).
Late time windows (180–260 ms [P2], 260–350 ms [N3], and 350–450 ms [P3]).
In all 3 time windows significant main effects were found for Frequency (P2: F(1,15) = 8.9, MSE = 13.6, P = 0.009; N3: F(1,15) = 74.6, MSE = 23.3, P < 0.001; P3: F(1,15) = 33.3, MSE = 39.2, P < 0.001) and Cognate Status (P2: F(1,15) = 10.6, MSE = 12.1, P = 0.005; N3: F(1,15) = 58.3, MSE = 22.3, P < 0.001; P3: F(1,15) = 20.6, MSE = 28.5, P < 0.001). ERPs in the high-frequency condition were significantly more negative going compared with those elicited by the low-frequency condition and ERPs in the cognate condition were significantly more negative going than those in the noncognate condition. Both effects were widely distributed over the scalp, reaching their maximum at posterior and right-frontal sites (see Fig. 5). The interaction between Frequency and Cognate Status was significant, but only for the 2 later time windows (P2: F < 1; N3: F(1,15) = 18.8, MSE = 14.1, P = 0.001; P3: F(1,15) = 20.9, MSE = 11.5, P < 0.001). It is noteworthy that this pattern of results is identical to that found in Experiment 1.
Finally, as in Experiment 1, in none of the 3 time windows was there a significant effect of Block (P > 0.250) or a significant interaction between Block and Frequency (P > 0.150) or between Block and Cognate Status (F < 1).
As in Experiment 1, no significant correlations were found between the individual splitting point of the frequency and cognate contrasts in the ERPs with the individual mean naming latencies of those contrasts in the behavioral data (P > 0.4), nor did we find significant correlations between the individual peak latencies of the P2, N3, and P3 (P > 0.36).
A significant positive Pearson product–moment correlation was found between the P2 mean amplitude difference between the high- and low-frequency conditions and the difference in naming latencies between the high- and low-frequency conditions at fronto-central electrodes (R = 0.512, P = 0.042), and a marginally significant one at left central electrodes (R = 0.492, P = 0.053). There was also a trend towards a positive correlation at centro-parietal electrodes (R = 0.447, P = 0.082).
In the N3 range marginally significant positive correlations between the frequency effect in the naming latencies and the frequency effect in the ERPs were present at left parietal (R = 0.492, P = 0.053) and left central electrodes (R = 0.483, P = 0.058). Small trends toward positive correlations in the same direction were present at centro-parietal (R = 0.465, P = 0.070) and right central electrode clusters (R = 0.427, P = 0.099). Finally, for the P3, only a small trend towards a positive correlation was present at left parietal electrode sites (R = 0.428, P = 0.098).
For the cognate contrast a similar pattern of correlations was found: There were significant correlations between the difference in P2 mean amplitude between cognate and noncognate conditions and the difference in naming latencies between cognate and noncognate conditions at left frontal (R = 0.535, P = 0.033) and left central electrode clusters (R = 0.502, P = 0.047). A trend towards a positive correlation was present at fronto-central electrodes (R = 0.449, P = 0.081). As in Experiment 1, these correlations disappeared in the N3 (P > 0.190) and the P3 (P > 0.560) ranges.
Comparison with Experiment 1
Differences in ERP splitting point latencies and RTs between experiments (i.e., between the 2 participant groups) were not significant (all P > 0.1). However, we did observe a marginally significant main effect of Group in the P2 range (and also for subsequent peaks) (naming in L1 vs. naming in L2; F(1,30) = 3.4, MSE = 77.1, P = 0.060; see Fig. 6b) and, importantly, a marginally significant interaction between Group and Frequency (F(1,30) = 3.8, MSE = 7.2, P = 0.062; see Fig. 6b). The frequency effect showed a larger amplitude difference at the P2 in Experiment 2 (naming in L2; difference: 1.9 μV) compared with Experiment 1 (naming in L1; difference: 1.1 μV). None of the other interactions with Group was significant (all P > 0.1).
The results of Experiment 2, obtained in a different group of participants performing the picture naming task in their L2, were overall highly similar to those obtained in Experiment 1. First, naming latencies displayed the expected frequency and cognate effects, and the interaction between frequency and cognate status was also replicated. (At the moment we do not have an explanation for this interaction. A similar interaction has been reported before in behavioral picture naming experiments, e.g., Ivanova and Costa 2008, but also the reverse interaction; e.g., Christoffels et al. 2003, and sometimes none; e.g., Costa et al. 2000. It might be that the presence and direction of this interaction depend on differences in [uncontrolled] stimulus properties between experiments. Independently, in both experiments the interaction becomes apparent in the ERPs only after the splitting point and the P2 [i.e., after 240 ms], which is beyond the period of interest in this paper.) Second, in the ERPs, early differences between low- and high-frequency ERPs (184 ms) and between cognate and noncognate ERPs (184 ms) were observed in a similar time window as those found in Experiment 1. Third, the pattern of correlations resembled that seen in Experiment 1. Indeed, there were no correlations between the individual splitting points of the experimental contrasts and the individual naming latencies of the fastest condition of those contrasts. There were no correlations between P2, N3, and P3 peak latencies either, but there were significant positive correlations between the individual frequency and cognate effects in the naming latencies and the mean amplitude differences for both the frequency and cognate effects in the ERPs. These correlations were strongest in the P2 range and weaker or absent in the N3 and P3 ranges.
Only 2 qualitative differences were found between the 2 experiments:
1) The frequency effect had an earlier onset compared with the cognate effect in Experiment 1, but this latency difference was absent in Experiment 2. Faster conceptual processing for high frequency words due to their higher familiarity (a conceptual property), which was not present for the cognate contrast, could account for the faster engagement in lexical processing for those items and consequently for the earlier splitting latency. However, the similarity in splitting latency for lexical frequency and cognate status in Experiment 2 invalidates this account. In addition, given that analyses of splitting latencies for lexical frequency and cognate status are based on different sets of stimuli (50% different), it is likely that conceptual processing has a different duration on average for each picture set (in that sense the similarity in Experiment 2 is more surprising than the different onset in Experiment 1). Nevertheless, ERPs had the same morphology in both conditions (see Fig. 6a) and amplitude differences were found in the time window previously associated with lexical processing (175–250 ms according to Indefrey and Levelt 2004).
2) We encountered more positive P2 amplitudes in Experiment 2 (naming in L2) than in Experiment 1 (naming in L1; see Fig. 6b). This is an important observation in light of our theoretical claims. (This observation is also of importance regarding the bilingual naming disadvantage in the nondominant language; e.g., Indefrey 2006; Ivanova and Costa 2008. However, because this topic does not fall under the scope of the present study, this result will only be discussed in light of the theoretical claims made in the present study. These findings with respect to the bilingual naming disadvantage will be discussed elsewhere [including more subjects in each group and adding a within-subjects experiment].) One may argue that due to the interactivity of the brain, different interconnected representational systems, such as the lexical system and the semantic (or object [imaginal] representation) system, may benefit from and even share processing activation across domains (e.g., Paivio 1986). In such a scenario our results would still reflect the first influence of lexical variables during speech production processing (given the cognates), but not necessarily processing solely at a lexical level. That said, the dual-code view cannot account for the amplitude difference between L1 and L2 naming in the P2 range (see Fig. 6b). Naming in L1 versus L2 should activate the exact same semantic (object) representation (e.g., Kroll and Stewart 1994) and a difference between the 2 can only be explained at the lexical level where the representational format is distinct (recall that subjects are early, highly proficient bilinguals using both languages on a daily basis, and that stimuli were concrete words).
Although the validity of between-group comparisons can be disputed (especially with ERPs), it is difficult to imagine that 2 groups of participants viewing the exact same images overall would display between-group differences by chance in the same time range (∼192 ms) and in the same manner as differences generated by lexical frequency and cognate status manipulations. This is especially true because we also encountered a (marginally) significant interaction with lexical frequency between experiments. Such a pattern of results is unlikely to arise from coincidental between-group differences. A stronger P2 modulation for the frequency effect during L2 naming compared with L1 naming can only be readily explained by assuming that these effects occur at the lexical level. In addition, the fact that for lexical frequency there was a difference in familiarity (a conceptual property) while this difference was absent for cognate status, and that both variables elicited similar ERP effects, also argues against the possibility that the present results merely reflect lexical influences during conceptual processing.
The main aim of this study was to characterize the time course of lexical access in speech production using a technique with high temporal resolution, event-related potentials. Two effects known to affect picture naming latencies were investigated: the word frequency and the cognate effects. Participants showed reliable and robust frequency and cognate effects in both experiments, replicating previous studies (e.g., Oldfield and Wingfield 1965; Jescheniak and Levelt 1994; Costa et al. 2000; Navarrete et al. 2006; Almeida et al. 2007; Christoffels et al. 2007). Crucially, we found early ERP effects of frequency and cognate status, independently of whether naming was performed in L1 or L2. High-frequency ERPs diverged from low-frequency ERPs at around 180 ms after picture onset (172 ms in Experiment 1, 184 ms in Experiment 2), coinciding with the onset of a positive wave (P2), with the high lexical frequency condition eliciting lower ERP amplitudes than the low-frequency condition. A similar pattern of results was found for cognates: ERPs elicited by pictures with cognate names started to diverge from those elicited by pictures with noncognate names around 190 ms after stimulus presentation (200 ms in Experiment 1; 184 ms in Experiment 2), with noncognates eliciting greater amplitudes than cognates.
The Word Frequency Effect as an Index of Lexical Access
The early effect of word frequency in picture naming suggests that speakers start the lexicalization process very early on after picture presentation. That is, to the extent that word frequency affects lexical retrieval, we propose that participants started the lexicalization processes between 150 and 200 ms after picture onset. This time estimation is consistent with results from MEG studies (e.g., Levelt et al. 1998; Maess et al. 2002) and covert “lexical” ERP studies (e.g., Schmitt et al. 2000), and fundamentally agrees with the meta-analysis conducted by Indefrey and Levelt (2004). Importantly, the early effect of word frequency during picture naming is unlikely to index the time of retrieval of target lexical representations given the absence of correlation between ERP splitting latencies and naming latencies for high-frequency words. Instead, we propose that this effect coincides with initial activation and retrieval operations within the lexicon, that is, with the point in time at which a lexical representation becomes activated and enters the competitive process of selection, with high-frequency items showing a head start over low-frequency items due to their (permanently) higher activation levels. The splitting point between ERPs can therefore be seen as the transition phase between conceptual and lexical selection processes (see e.g., Thorpe et al. 1996; Hauk et al. 2007; for object recognition time estimates).
Besides the descriptive chronometric information provided by our results, the present findings also have implications for the locus of the frequency effect in speech production. As mentioned in the introduction, word frequency affects solely the retrieval of the phonological properties of a word according to some authors (e.g., Jescheniak and Levelt 1994; Jescheniak et al. 2003). Under this assumption, we should have expected the ERPs in the high- and low-frequency conditions to start diverging when a word's phonological code is supposed to be retrieved, that is, around 275 ms according to Indefrey and Levelt (2004). The presence of a word frequency effect at ∼180 ms is at odds with this position, and suggests that word frequency also affects the speed with which lexical items are retrieved from the lexicon (e.g., Caramazza et al. 2001; Navarrete et al. 2006; Almeida et al. 2007).
However, finding an early effect of word frequency does not rule out an effect at later processing stages, such as that of phonological encoding. Indeed, our results show that differences between low-frequency and high-frequency ERPs are present in later time windows as well. In fact, the amplitude difference between the ERPs for high- and low-frequency words correlated positively with naming latency differences in the time windows in which lexical processing (the P2 window) and also phonological encoding (the N3 window) are supposed to take place. Therefore, considering hypothetically that different cognitive processes take place in these different time windows (e.g., lexical access at an early stage, phonological access at a late stage), we can conclude that both processing stages are affected by word frequency. This view of the ubiquitous effects of word frequency is consistent with recent hemodynamic evidence showing that word frequency modulates the activity of brain areas thought to be involved in the retrieval of lexico-semantic as well as phonological information (Graves et al. 2007) and recent studies with brain-damaged patients showing frequency effects for semantically and phonologically related errors and errors resulting in nonwords (e.g., Kittridge et al. 2007; Knobel et al. 2008).
A possible caveat when interpreting the early effect of word frequency in the ERP data is the potential correlation of this variable with conceptual variables such as familiarity and imageability. Indeed, stimuli ratings on conceptual variables conducted for the pictures used in the present experiments showed significant familiarity differences between the low-frequency and high-frequency items. This means that the early ERP effects for frequency could be driven by these conceptual differences. However, we also found early cognate effects in the ERPs, in a rather similar time window to that of the frequency effects. In fact, when plotted together the electrophysiological signatures of the cognate effect and the frequency effect are practically identical (see Fig. 6a) and the correlation patterns between behavioral differences and electrophysiological differences are also similar. Given that, unlike word frequency, cognate status is not correlated with conceptual variables (see also stimuli ratings), the early effect of cognate status cannot be interpreted as a mere effect of correlated conceptual variables, but rather reflects some sort of lexical effect. Thus, if one is willing to interpret the cognate effect at such an early time as revealing lexical processing, it is reasonable to interpret word frequency effects along the same lines.
On the Origin of the Cognate Effect
As mentioned above, the results of the cognate manipulation are useful when interpreting the word frequency effects. In addition, the early effect of cognate status also sheds light on its origin. Interestingly, such an effect was also descriptively reported in the study by Christoffels et al. (2007, personal communication), who found significant differences between cognate and noncognate ERPs as early as 175 ms after picture presentation.
The parallel results observed for word frequency and cognate manipulations suggest that the 2 effects might have the same origin at the lexical level. Because of the phonological overlap between a cognate and its translation, every time a cognate word is heard or uttered, both the target lexical representation and its translation are strongly activated, irrespective of the language of utterance. In contrast, when a noncognate word is produced or heard, the translation word will probably not be activated that strongly, given the lack of phonological overlap. Following this rationale, cognate lexical representations should have a higher frequency than noncognate lexical representations, because the former are activated more often. (Both interactive, e.g., Dell 1986, and sequential, e.g., Levelt et al. 1998, models of speech production can nicely capture this assumption. According to interactive models, the activated phonological segments of the target word will send activation back to any word with which they are linked. In such a scenario, the utterance of a cognate word will cause its translation to become lexically activated as well, due to feedback activation from shared phonological segments onto the lexical level with which these segments are linked. A similar process will not unfold for noncognate translations, because they hardly share phonological segments; see Costa et al. 2005. Sequential models can explain a cognate effect at the lexical level in a similar manner through comprehension: every time a cognate is heard [also through perception of one's own voice], the similar phonological content will also activate the nontarget lexical representation, whereas for noncognates this bottom-up activation via phonology will only result in activation of the target word.)
Consequently, the cognate effect may reflect a word frequency effect in disguise, with cognate words behaving as high-frequency words and noncognate words as low-frequency words (see also Kirsner et al. 1993; Sánchez-Casas and García-Albea 2005, for an alternative explanation of cognate effects at the lexical level). Note that a similar explanation at the conceptual level cannot account for the cognate effect because the conceptual representations of cognates and noncognates are considered to be shared between L1 and L2 (e.g., Kroll and Stewart 1994).
Independently of the precise nature of the cognate effect, our results reveal that cognate status has an effect at early stages of lexical access. However, as in the case of word frequency, this early effect does not preclude effects of cognate status at subsequent processing levels. For instance, Christoffels et al. (2007) reported ERP cognate effects between 275 and 375 ms after target presentation, with enhanced negative amplitude for cognate relative to noncognate ERPs. Consistent with this late effect, we found a similar modulation at the N3. Thus, as reported previously for word frequency, cognate status appears to affect both lexical processing (P2 range) and phonological encoding (N3 range), and therefore seems to affect picture naming latencies in a similar way as word frequency.
Methodological Issues and ERP Components of Interest
So far we have mainly discussed the data in the temporal domain. Given the novelty of EEG studies using overt picture naming, it is pertinent to dedicate some words to the ERP components of interest identified in the experiments. Before doing so, however, some potential methodological pitfalls need to be discussed.
In the experiments there were 16 stimuli per condition (32 per experimental contrast), and many repetitions were needed (6) in order to obtain enough trials per condition. Stimulus repetition in ERPs can be a substantial source of modulation (e.g., Bentin and McCarthy 1994; Rugg and Doyle 1994). The repetition effects in this study were however negligible and, critically, there were no interactions between repetition and other factors in either experiment, suggesting that stimulus repetition did not affect the frequency and cognate effects reported. In particular, the P2 modulation by frequency and cognate status had the same magnitude in all experimental blocks. The small magnitude of the repetition effects observed may be due to 3 factors: 1) we did not record the first presentation of the stimuli (familiarization phase) and therefore could not measure the ERP differences between the familiarization phase and the first experimental presentation, where the strongest ERP repetition effects are to be expected (ERP studies using multiple repetitions have shown that ERP differences elicited by repetition are largest for the first repetition and seem to decrease or even vanish with subsequent repetitions; e.g., Gruber and Müller 2005; Friedman and Cycowicz 2006); 2) the irregularity of the lag between repeated items (e.g., Henson et al. 2004); and 3) the relatively large average lag between repeated stimuli (e.g., Henson et al. 2004).
Another possible confound in the present study is that different physical stimuli were used in the different experimental conditions. This could result in spurious ERP amplitude modulations caused by physical stimulus differences rather than the cognitive manipulation of interest (e.g., Picton et al. 2000). However, because 50% of the stimuli were completely different between the high-frequency and cognate conditions, finding similar time courses of differences and correlation patterns by chance is unlikely. In addition, the ERP pattern correlated with the behavioral results, where naming latency differences are less likely to arise from distinct physical stimuli. Finally, when directly comparing L1 against L2 naming overall, that is, when comparing ERPs elicited by the exact same set of pictures in 2 different naming contexts, we find the same P2 modulation as in Experiments 1 and 2 taken separately (see Fig. 6b).
The ERP components observed here were not systematically interpreted in a traditional way because this study focused on divergence latencies and on correlations between amplitude and naming latency. Nevertheless, the peaks observed may be related to classical components described in the literature. For instance, P3 mean amplitude was more pronounced for low- than for high-frequency words, which may suggest postlexical reprocessing of word-related information (e.g., Polich and Donchin 1988; Hauk and Pulvermüller 2004). However, given the marked positivity of the P3 component in our dataset, amplitude modulation could also partially reflect early stages of motor preparation.
Perhaps the most interesting results were found for the P2 range, given the correlation between the frequency and cognate effects on naming latency and the P2 mean amplitude differences between high- and low-frequency picture names and between cognates and noncognates. Individuals showing larger frequency and cognate naming latency effects also showed larger P2 mean amplitude differences for the same contrasts. In other words, P2 amplitude appears to reflect the ease of lexical access, with lower amplitudes associated with easily accessible representations (high-frequency words and cognates) and larger amplitudes associated with less accessible representations. Such amplitude modulations are consistent with the Hebbian theory of cell assemblies (e.g., Pulvermüller 1999; Hauk and Pulvermüller 2004). Another possible interpretation for the P2 comes from word recognition experiments manipulating vocabulary class (e.g., King and Kutas 1998; Brown et al. 1999; Osterhout et al. 2002). These studies reported, in a similar vein to the present study, reduced P2 amplitudes for closed-class words (faster condition) as compared with open-class words (slower condition). Although these studies did not elaborate much on this finding, it was suggested that the P2 modulation might reflect attentional differences between nonlexical aspects of the stimuli such as length (cf. Mangun and Hillyard 1995). For the present study, however, such an account does not seem valid. Given the results observed for the cognates, as explained extensively above, the P2 modulations reported here have to be related, at least in part, to linguistic processes. It is possible that these P2 effects are indeed confounded by attention, with rare stimuli eliciting larger attentional shifts than more common stimuli (e.g., Luck and Hillyard 1994), but this would not take away the value of our observation because, in that case, these P2 differences would most likely reflect attentional resources needed during lexical activation.
Future research exploring further the functional characteristics of the P2 component will probably provide fundamentally new insights regarding lexical access in speech production.
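To make the logic of the correlation analysis discussed above concrete, the following minimal sketch computes, for each participant, a behavioral effect (low- minus high-frequency naming latency) and a P2 effect (low- minus high-frequency mean amplitude), and then correlates the two across participants. All numbers are invented for illustration and are not data from the present experiments; the function names (`condition_effect`, `pearson_r`) and the use of a plain Pearson coefficient are assumptions of this sketch, not a description of the analysis pipeline actually used.

```python
import math


def condition_effect(slow_condition, fast_condition):
    """Per-participant difference between two conditions (slow minus fast)."""
    if len(slow_condition) != len(fast_condition):
        raise ValueError("both conditions need one value per participant")
    return [s - f for s, f in zip(slow_condition, fast_condition)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-participant means (NOT data from the study).
# Naming latencies in ms for low- and high-frequency picture names.
rt_low = [720, 705, 760, 690, 740, 715]
rt_high = [690, 695, 700, 680, 700, 700]
# P2 mean amplitudes in microvolts for the same participants and conditions.
p2_low = [5.1, 4.2, 6.3, 3.9, 5.6, 4.4]
p2_high = [4.3, 4.0, 4.8, 3.7, 4.6, 4.0]

rt_effect = condition_effect(rt_low, rt_high)   # behavioral frequency effect
p2_effect = condition_effect(p2_low, p2_high)   # P2 amplitude difference

r = pearson_r(rt_effect, p2_effect)
print(f"r = {r:.2f} across {len(rt_effect)} hypothetical participants")
```

A positive coefficient in such an analysis corresponds to the pattern described above: participants with a larger naming latency effect also show a larger P2 amplitude difference. The same computation would apply to the cognate contrast by substituting noncognate and cognate values.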
For the first time, early electrophysiological differences elicited by manipulation of lexical frequency and cognate status during overt picture naming were established. Based on the latency of ERP divergences between conditions, lexical access is estimated to occur at around 180 ms after target presentation. Aside from this important chronometric contribution, the present study offers a promising new paradigm using a simple task and a simple experimental design to study the time course of speech production. In addition, by identifying an early electrophysiological correlate of lexical processing, this study may be a starting point for approaching a variety of psycholinguistic phenomena in language production from a whole new perspective.
Spanish Government grant (PSI2008-01191); Project Consolider-Ingenio 2010 (CSD 2007-00012); and Spanish Government (FPU-2007-2011) predoctoral fellowship supported K.S.
We would like to thank Phillip Holcomb and 2 anonymous reviewers for their helpful comments on the previous version of this manuscript. We would also like to thank Elin Runnqvist for her help in revising this manuscript. Conflict of Interest: None declared.
Appendix A. Language history and the self-assessed proficiency for all participants
Language history and self-assessed proficiency scores of participants. Mean age and SD (in parentheses) are given in years. The onset of L2 acquisition refers to the mean age (in years) at which participants started learning Catalan/Spanish. “Use of L2” refers to how long (in years) participants have been using the L2 regularly. The proficiency scores were obtained through a questionnaire filled out by the participants after the experiment. The scores are on a 4-point scale, in which 4 represents native speaker level; 3, good level; 2, medium level; and 1, poor level of proficiency. The self-assessed index is the average of the participants’ responses across 4 domains (speech comprehension, speech production, reading, and writing).
|Language history||Age||L2 onset||L2 use||# Years in Catalunya|
|Spanish–Catalan bilinguals||20 (2)||4 (2)||16 (4)||20 (1)|
|Catalan–Spanish bilinguals||22 (2)||5 (3)||17 (2)||22 (1)|
Appendix B: List of stimuli used in the Experiment
|Low-frequency noncognates||Low-frequency cognates|
|High-frequency noncognates||High-frequency cognates|