Keep calm and carry on: electrophysiological evaluation of emotional anticipation in the second language

Abstract
Investigations of the so-called ‘foreign language effect’ have shown that emotional experience is language-dependent in bilingual individuals. Response to negative experiences, in particular, appears attenuated in the second language (L2). However, the human brain is not only reactive, but it also builds on past experiences to anticipate future events. Here, we investigated affective anticipation in immersed Polish–English bilinguals using a priming paradigm in which a verbal cue of controlled affective valence allowed making predictions about a subsequent picture target. As expected, native word cues with a negative valence increased the amplitude of the stimulus preceding negativity, an electrophysiological marker of affective anticipation, as compared with neutral ones. This effect was observed in Polish–English bilinguals and English monolinguals alike. The contrast was non-significant when Polish participants were tested in English, suggesting a possible reduction in affective sensitivity in L2. However, this reduction was not validated by a critical language × valence interaction in the bilingual group, possibly because they were highly fluent in English and because the affective stimuli used in the present study were particularly mild. These results, which are neither fully consistent nor inconsistent with the foreign language effect, provide initial insights into the electrophysiology of affective anticipation in bilingualism.


Introduction
The human brain can be conceived of as a prediction machine that builds upon prior experience to anticipate forthcoming sensory stimuli (Engel et al., 2001; Bar, 2007, 2009; Van Berkum, 2010), construct emotional experiences (Barrett and Bar, 2009; Barrett and Simmons, 2015; Hoemann et al., 2017) or make decisions (e.g. Summerfield and de Lange, 2014; de Lange and Fritsche, 2017). A growing body of literature suggests that the brain goes well beyond being a reactive organ merely responding to incoming stimuli, and that it constantly prepares the resources needed to increase operational efficiency (Sterling, 2012). Electrophysiological studies of anticipatory processing have investigated slow waves that build up between the presentation of a warning stimulus (S1) and a target stimulus (S2), by comparing conditions in which S1 announces S2 with conditions in which it does not (Brunia et al., 2011a). In classical threat-of-shock experiments, for instance, participants expect an electrical shock (S2) on the basis of a preceding warning (S1; e.g. Böcker et al., 2001). An event-related brain potential (ERP) modulation of particular interest in such S1-S2 paradigms is the contingent negative variation (CNV; Walter et al., 1964), consisting of an early phase, reflecting attention to and processing of the warning stimulus, and a late phase, reflecting S2 anticipation and motor response preparation.
Other studies have found an increase in the amplitude of the stimulus preceding negativity (SPN) when participants anticipate arousing, affective pictures (Takeuchi et al., 2005; Poli et al., 2007; Moser et al., 2009; Thiruchselvam et al., 2011; Perri et al., 2014). For example, Poli et al. (2007) measured the extent to which the anticipation of negative and positive pictures would modulate SPN amplitude and heart rate. Participants were first presented with an affective (e.g. 'blood') or neutral (e.g. 'people') verbal cue preparing them for upcoming pictures of varying valence (positive, negative and neutral) and arousal (high arousal and low arousal). Irrespective of affective valence, SPN amplitudes increased and heart rate slowed down in the anticipation of highly arousing pictures (e.g. erotic content or injuries) relative to low arousing pictures, even though Poli et al. (2007) used only six words as S1 cues. Swannell et al. (2016) also found that words associated with the concept of pain modulated participants' anticipation of a heat stimulation delivered through a laser, when S1 presentation was subliminal, i.e. unconscious.
Here, we set out to investigate whether affective anticipation can be modulated by the language in which S1 is presented, within the same participants, and when S1 valence is matched between languages, in order to test the hypothesis of dampened emotional response when imbalanced bilinguals operate in their second language (Wu and Thierry, 2012;Jończyk, 2016a;Jończyk et al., 2016;Sheikh and Titone, 2016;Baumeister et al., 2017;Iacozza et al., 2017;Jankowiak and Korpal, 2017).
For instance, in a recent eye-tracking study, Iacozza et al. (2017) reported increased pupil dilation, indexing physiological arousal, in 27 Spanish-English bilinguals reading sentences with a negative connotation in Spanish, their native language (L1), as compared with English, their second language (L2). In the same vein, Jankowiak and Korpal (2017) asked 27 Polish-English bilinguals to listen to or read fragments that were negative (e.g. narrating someone's death) or neutral (e.g. describing a city) in Polish or in English. When comparing English to Polish contexts, the authors reported decreased electrodermal activity in participants reading negative descriptions, but this was only found for reading as opposed to listening to the descriptions. These findings corroborate earlier electrophysiological evidence for reduced amplitudes of the N400, an electrophysiological index of semantic processing, selectively when negative information was presented in L2, whether in word pairs (Wu and Thierry, 2012) or naturalistic sentences (Jończyk et al., 2016). Thus, operating in the non-native language appears to provide late bilinguals with some kind of affective protection vis-à-vis negative emotional content. These findings provide neurophysiological support for bilinguals' subjective reports about their emotional experiences in L1 and L2, in which the bilinguals' L2 is often portrayed as being emotionally more distant, disembodied, or detached from early memories compared with their L1 (for reviews, see Pavlenko, 2012; Caldwell-Harris, 2015; and Jończyk, 2016a). Other studies have reported effects consistent with the results reviewed above but in terms of a relatively greater sensitivity to positive information in L1, which can be considered the mirror image of reduced sensitivity to negativity in L2. For example, Hsu et al. (2015) observed an increased hemodynamic response to positive extracts from Harry Potter when these were presented to 24 German-English bilinguals in their L1 as compared with their L2. Gao et al. (2015) further showed that positive feedback delivered in Chinese to 16 Chinese-English bilinguals incited them to take 10% more risk in the following even-probability gambling trial, irrespective of amounts to be won or lost, whereas no difference was observed between positive and negative conditions when feedback was provided in English. Overall, despite the initial null results reported in electrophysiological studies of emotion-language interaction in bilinguals (Conrad et al., 2011; Opitz and Degner, 2012), the evidence accumulated so far sways in favor of a moderating effect of the second language on emotional experience in late bilinguals.
However, previous studies showing dampened affective sensitivity in L2 stand in contrast to results from cognitive studies using behavioral paradigms such as the emotional Stroop task (Sutton et al., 2007; Eilola et al., 2007; Grabovac and Pléh, 2014; but see Winskel, 2013), the affective priming task or the emotional word recall and recognition task (Ferré et al., 2010; Ferré et al., 2013; but see Baumeister et al., 2017). Indeed, the latter studies have reported little or no difference in the automaticity of emotional word processing across L1 and L2 (for most recent reviews, see Caldwell-Harris, 2015; Jończyk, 2016a).
In this study, we examined, for the first time, the neurophysiological effects of language of operation on anticipatory affective processing, that is how the bilingual brain prepares for an upcoming emotional event. We employed the S1-S2 paradigm commonly implemented in studies of anticipation in combination with ERPs to investigate the relative amplitude of anticipatory potential variations elicited by stimulus cues (see Wu and Thierry, 2017, for a similar approach in the context of preparation for speaking). To our knowledge, only one study has previously investigated a possible interaction between affective anticipation and language of operation using physiological measures (García-Palacios et al., 2018). However, that study focused on language use (counting task), not semantic processing, and targeted physiological measures that are a proxy for stress levels in a fear conditioning context (pupil dilation and electrodermal activity).
In the present experiment, participants were presented with a negative, positive or neutral verbal cue (S1), either in their L1 (Polish) or in their L2 (English). The cue predicted the valence and meaning of the target picture (S2) in 50% of trials. Following picture presentation, participants were asked to determine whether the target picture and verbal cue were related in meaning or not. Consistent with previous SPN studies (van Boxtel and Böcker, 2004;Moser et al., 2009;Ohgami et al., 2014;Shafir et al., 2015), we focused on an early and a late time window of analysis to obtain a comprehensive picture of the anticipatory period preceding S2 presentation. We focused predictively on the relative impact of negative compared with neutral cues when late Polish-English bilinguals anticipated the display of emotional pictures, since positive stimuli usually afford less sensitivity.
Consistent with a hypothetical mechanism of repression (Wu and Thierry, 2012), we predicted more negative SPN amplitudes in both the early and late time windows of the SPN following a negative vs a neutral cue in L1 Polish but no such modulation for L2 English cues. We also recruited a group of monolingual English controls to ensure that the English stimuli would successfully elicit affective anticipation effects in native speakers of the language and thus expected to find SPN modulations in both the early and the late time windows in the group of native English speakers tested in English.

Participants
Twenty-one Polish-English bilinguals and twenty-one English monolingual speakers from the area of North Wales gave informed consent to take part in the study, which was approved by the ethics committee of Bangor University, Wales, UK. Data from four bilingual and three monolingual participants were discarded due to an insufficient number of clean segments of electrophysiological (EEG) data per condition or excessive alpha contamination. This resulted in a final participant sample of 17 bilinguals (M age = 23.4, standard deviation [SD] = 3.9; 6 males, 11 females) and 18 monolinguals (M age = 24.4, SD = 2.7; 12 males, 6 females). Except for one ambidextrous monolingual participant, participants from both groups were right-handed. Also, all participants were residing in North Wales at the time of the experiment and reported having (corrected-to-) normal vision. The bilingual participant group consisted of Polish native speakers who acquired English after puberty. All participants were immersed in the British culture at the time of testing, but immersion time varied widely, from 3 months to 16 years (M immersion = 6.6 years, SD = 5.8). Constraining participant selection on the basis of cultural exposure was not possible given the relatively small size of the Polish community in North Wales. The participants reported using both Polish and English on an everyday basis, in both formal and informal contexts. Their English proficiency in reading, writing, speaking and listening was self-reported using an adapted version of the Language History Questionnaire 2.0 (Li et al., 2014; Table 1). The control monolingual participant group consisted of native English speakers who acquired English after birth and had limited exposure to other languages. Participants were compensated for their time with £12.

Stimuli
The experiment conducted with bilinguals comprised 62 positive, 62 negative and 124 neutral English words selected from an affective word database (Warriner et al., 2013), paired with 62 positive and 62 negative pictures taken from Google image online databases. The word-picture pairs were related or unrelated in meaning based on a pre-experimental norming study. Each word served as a cue to elicit anticipation of a subsequently presented picture in four affective cue-target combinations: negative word-negative picture, positive word-positive picture, neutral word-negative picture and neutral word-positive picture. In each case, 62 combinations were related and 62 were unrelated. Unrelated pairs were created by re-pairing word cues and pictures within the positive and negative conditions and making sure through pre-experimental norming that no spurious relatedness randomly arose. Polish word cues were translations of the English word cues produced by a highly proficient Polish-English bilingual with expertise in bi-directional Polish-English translation. To validate stimulus selection, all items were back-translated from Polish to English by six proficient Polish-English bilinguals who did not take part in the study: 85% of translations overlapped three or more times out of six. The remaining 15% yielded fewer stable translations, but most were closely related or synonymous (e.g. 'rich' for 'wealthy', 'tumor' for 'cancer', 'baby' for 'new-born', etc.). Inter-rater agreement between translators was high (Intraclass Correlation Coefficient, ICC = 0.84).
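The back-translation check described above can be illustrated with a short Python sketch. It assumes one possible reading of the criterion, namely that an item counts as stable when its most frequent back-translation was produced by at least three of the six translators; the function name `is_stable` and the example words are hypothetical and not taken from the study materials.

```python
from collections import Counter


def is_stable(back_translations, min_overlap=3):
    """Return True if the most frequent back-translation occurs at least
    `min_overlap` times (assumed reading of 'overlapped three or more
    times out of six')."""
    # Normalise case and surrounding whitespace before counting.
    counts = Counter(t.strip().lower() for t in back_translations)
    return counts.most_common(1)[0][1] >= min_overlap


def proportion_stable(items):
    """Proportion of items whose back-translations meet the criterion."""
    return sum(is_stable(bt) for bt in items) / len(items)


# Hypothetical example: two items, each back-translated by six people.
items = [
    ["rich", "wealthy", "rich", "rich", "affluent", "wealthy"],
    ["baby", "infant", "new-born", "child", "newborn", "toddler"],
]
print(proportion_stable(items))
```

Applied to the full item set, a function of this kind would return the proportion that the text reports as 85%.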
Zipf values for word lexical frequencies were collected from the SUBTLEX-UK (van Heuven et al., 2014) and SUBTLEX-PL (Mandera et al., 2015) databases. In English, no significant differences were observed in lexical frequencies between positive (M = 4.38; 95% confidence interval [CI]  .49]) and neutral words, P bonf < 0.001), with no difference between positive and negative words (P bonf = 0.1). Based on post-experimental valence norming reported in a previous experiment with the same population (Jończyk, 2016b; Jończyk et al., 2016) and the fact that Polish words were direct translations of the English words, we assumed rough comparability in valence and arousal between languages. Nevertheless, we tested this assumption post-experimentally with a group of Polish speakers who did not take part in the experiment. The 248 Polish words used in the experiment were rated on valence and arousal on a scale from 1 to 9 (1 = highly negative/low arousing; 9 = highly positive/highly arousing) by 29 and 26 individuals, respectively. The scale and rating procedures followed Warriner et al. (2013), to ensure comparability between scales and instructions for further analysis. The scores obtained from this study and the database by Warriner et al. (2013) were subjected to two 3 (valence: positive; negative; neutral) × 2 (language: Polish; English) by-item ANOVAs. The emotionality analysis revealed a main effect of valence, F(2,490) = 1982, P < 0.001, η² = 0.89, with significant differences between negative (M = 2.32; 95% CI [2.20, 2.43]), positive (M = 7.39; 95% CI [7.28, 7.50]) and neutral (M = 5.4; 95% CI [5.32, 5.48]) words, Ps < 0.001. The valence × language interaction was non-significant, F(2,490) = 1.15, P = 0.30, η² = 0.01.
The arousal analysis revealed a main effect of valence, F(2,490) = 236.24, P < 0.001, η² = 0.49, with higher arousal ratings for negative (M = 5.63; 95% CI [5.47, 5.79]) and positive (M = 5.35; 95% CI [5.19, 5.52]) words relative to neutral words (M = 3.6; 95% CI [5.19, 5.52]; Ps < 0.001). Furthermore, negative words were rated as marginally more arousing than positive words, at P = 0.06. Finally, the arousal analysis revealed a significant valence × language interaction, F(2,490) = 26.95, P < 0.001, η² = 0.09, with higher arousal ratings for negative and positive words in Polish (M negative = 6.01, 95% CI [5.77,6.24 Note, however, that ratings for Polish stimuli were obtained from a group of Polish native speakers, while ratings for English stimuli were obtained from the database by Warriner et al. (2013); hence, small differences across ratings were expected. For a graphical representation of valence and arousal ratings, see Figure 1.
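For readers unfamiliar with the Zipf scale used for the lexical frequencies above, the following Python sketch shows how a Zipf value relates to a raw corpus count: it is the base-10 logarithm of the frequency per million words, plus 3. The sketch is a simplification that omits the smoothing applied when the SUBTLEX norms were compiled, and the function name and example counts are hypothetical.

```python
import math


def zipf_value(count, corpus_size_tokens):
    """Zipf value = log10(frequency per million words) + 3.

    Simplified: no smoothing for unseen or very rare words is applied.
    """
    if count <= 0 or corpus_size_tokens <= 0:
        raise ValueError("count and corpus size must be positive")
    per_million = count / corpus_size_tokens * 1_000_000
    return math.log10(per_million) + 3.0


# Hypothetical word seen 1000 times in a 100-million-token corpus,
# i.e. 10 occurrences per million words.
print(round(zipf_value(1000, 100_000_000), 2))
```

On this scale, a mean of about 4.4, as reported for the positive English words, corresponds to roughly 25 occurrences per million words.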
Prior to the experiment, 174 participants rated 496 word-picture pairs for semantic relatedness on a scale from 1 (completely unrelated) to 7 (highly related). Unrelated pairs consisted of word-picture pairings in which the meaning of a word was not reflected in a picture; related pairs consisted of word-picture pairings in which the meaning of a word was reflected in a picture. To avoid word repetition, the pairs were counterbalanced across participants so that each verbal cue appeared only once in each version of the norming study. This resulted in four different versions of the norming study, each containing 124 word-picture pairs. A 3 (cue: positive, neutral, negative) × 2 (relatedness: related, unrelated) by-item ANOVA was run, with valence as a between-item factor, to assess potential differences in verbal cue-picture relatedness between pairs of varying affective valence. A main effect of relatedness was found, whereby related word-picture pairs were rated as more related (M = 6.03, 95% CI [5.94, 6.12]) than unrelated (M = 2.01, 95% CI [1.92, 2.10]) word-picture pairs (F(1,490) = 3648.8, P < 0.001, η² = 0.87). The interaction between word cue valence and relatedness did not reach significance, F(2,490) = 0.65, P = 0.52, η² = 0.0, indicating that positive, negative and neutral words did not significantly differ with regard to relatedness to target pictures.

Procedure
Participants were seated 100 cm away from a cathode-ray tube (CRT) monitor in a dimly lit and quiet room. Following EEG cap preparation, during which they completed a short questionnaire, participants were familiarized with the task. They were told that in each trial they would see a verbal cue announcing a picture target. In half of the trials (124 per language), the verbal cue was affectively neutral and was followed by either a positive or a negative picture target. Thus, in this condition, the cue did not allow strong anticipatory effects to take place. In the other half of the trials, the verbal cue was positive (62 per language) or negative (62 per language), and it was followed by a picture target of the matching affective valence. Thus, when the cue was affectively arousing, it enabled affective anticipation to take place. Following a word cue, target pictures were either related (50%) or unrelated (50%). For instance, the word cue 'accident' could be followed either by a picture depicting a motorbike accident or by the picture of a slave. Participants were asked to indicate whether the picture was related to the preceding verbal cue or not by pressing one of two designated buttons. Participants were not explicitly informed about the emotional relationship between cue words and pictures.
The structure of a trial was as follows: the word cue announcing a target was presented for a random duration between 300 and 400 ms (in steps of 10 ms), followed by a fixation cross that remained on the screen for 3800 ms (the anticipation window). Subsequently, a picture was flashed for 200 ms and participants had to respond within 2300 ms. After every 10 trials, a pink fixation cross was displayed for 5000 ms, during which time participants could rest their eyes and blink as needed. Across the experimental session, each picture was presented twice to English native speakers and four times to Polish-English bilinguals. As for word cues, they were never repeated within language in English native speakers or across languages in Polish-English bilinguals.
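The trial timeline described above can be summarised in a short Python sketch. It is illustrative only and not the presentation script used in the experiment: the function and constant names are hypothetical, and the sketch assumes that the 2300 ms response limit is counted from picture onset, which the text does not specify.

```python
import random

CUE_DURATIONS_MS = list(range(300, 401, 10))  # 300, 310, ..., 400 ms
FIXATION_MS = 3800        # anticipation window
PICTURE_MS = 200          # picture target duration
RESPONSE_LIMIT_MS = 2300  # assumed to start at picture onset
REST_EVERY_N_TRIALS = 10
REST_MS = 5000            # pink fixation cross


def build_trial(rng):
    """Return event times (ms, relative to cue onset) for one trial."""
    cue_ms = rng.choice(CUE_DURATIONS_MS)
    picture_onset = cue_ms + FIXATION_MS
    return {
        "cue_ms": cue_ms,
        "fixation_onset_ms": cue_ms,
        "picture_onset_ms": picture_onset,
        "picture_offset_ms": picture_onset + PICTURE_MS,
        "response_deadline_ms": picture_onset + RESPONSE_LIMIT_MS,
    }


def rest_break_after(n_completed_trials):
    """True if a rest period follows the given number of completed trials."""
    return n_completed_trials > 0 and n_completed_trials % REST_EVERY_N_TRIALS == 0


rng = random.Random(0)  # seeded only to make the example reproducible
print(build_trial(rng))
```

Because the cue duration is jittered, picture onset falls between 4100 and 4200 ms after cue onset, whereas the interval between fixation onset and picture onset is fixed at 3800 ms.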
Bilingual participants completed two blocks in English and two blocks in Polish. Languages were alternated between blocks; block order and response sides were counterbalanced between participants. Monolingual participants completed only two blocks of trials in English. Two researchers, one native speaker of Polish and one native speaker of English, were present at all times, enabling a short exchange in the language of the forthcoming block after each pause.

EEG recording and analyses
Electrophysiological data were continuously recorded in reference to electrode Cz at a rate of 1000 Hz from 64 Ag/AgCl electrodes placed according to the extended 10-20 convention. The vertical and horizontal electrooculograms (EOGs) were recorded from electrodes located above and below the right eye and at the outer canthus of each eye. Impedances were kept below 5 kΩ. EEG signals were amplified with a Neuroscan SynAmps2 amplifier unit (El Paso, TX) and filtered online with a band-pass filter between 0.05 and 200 Hz.
All pre-processing steps and analyses were performed using EEGLAB Toolbox (version 14.1.1; Delorme and Makeig, 2004) and ERPLAB Toolbox (version 6.1.4; Lopez-Calderon and Luck, 2014) in MATLAB (version R2017a, The Mathworks, Inc.). The signals were down-sampled offline to 250 Hz and all data were visually inspected for abnormalities. Sections of continuous data containing gross muscle artefacts were rejected. Abnormal channel activity was identified using the trimOutlier plugin (Lee and Miyakoshi SCCN, INC, UCSD) as well as by plotting channel spectra and maps in EEGLAB. No more than four channels were rejected per dataset in both the bilingual (M = 1.32; min = 0, max = 4) and monolingual (M = 0.35; min = 0, max = 2) participant groups. The rejected channels were interpolated using the spherical spline interpolation method. Next, continuous (non-segmented) data were digitally re-referenced to the average of all scalp electrodes (global average reference), excluding EOGs, and band-pass filtered between 0.1 and 30 Hz using a second-order infinite impulse response Butterworth digital filter (slope: 12 dB/oct). Subsequently, the extended infomax independent component analysis (ICA; Lee et al., 1999) was run to correct for vertical and horizontal EOG artefacts. The independent components (ICs) were inspected using a semi-automated procedure. First, the ICLabel plugin (Pion-Tonachini et al., 2019) was run to automatically classify ICs into broad source categories. Subsequently, selected ICs corresponding to artefactual activity originating in the eyes were visually inspected by plotting component activations. The mean number of rejected ICs per participant amounted to 1.47 (SD = 0.71; min = 1, max = 3) in the bilingual and 1.66 (SD = 0.59; min = 1, max = 3) in the monolingual group. Prior to accepting ICA correction, we plotted the EEG data before and after ICA correction to make sure that rejecting ICs did not impact the data in an adverse way.
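The re-referencing step mentioned above can be illustrated with a minimal Python sketch. The analyses themselves were run in EEGLAB and ERPLAB under MATLAB; this stand-alone function is only a hypothetical illustration of what a global average reference does, and it assumes that EOG channels have already been excluded from the input.

```python
def average_reference(scalp_data):
    """Re-reference to the global average.

    `scalp_data` is a list of channels, each a list of samples (in µV).
    At every time point, the mean across channels is subtracted from
    each channel. EOG channels are assumed to have been removed already.
    """
    n_channels = len(scalp_data)
    reference = [sum(samples) / n_channels for samples in zip(*scalp_data)]
    return [
        [value - ref for value, ref in zip(channel, reference)]
        for channel in scalp_data
    ]


# Toy example with two channels and two time points.
print(average_reference([[1.0, 2.0], [3.0, 6.0]]))
```

After this transformation, the channels sum to zero at every time point, so amplitudes are expressed relative to the mean scalp potential rather than to the online reference (Cz).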
Following ICA correction, epochs were extracted from the continuous EEG. For SPN analysis, 4000 ms epochs were extracted, starting 200 ms before anticipation cue onset. For the N400 analysis, 1000 ms epochs were extracted, starting 200 ms before target picture onset. Baseline correction was applied relative to pre-stimulus activity. Epoch rejection was based on the result of a peak-to-peak moving window in ERPLAB (threshold: ±80 μV; window size: 200 ms; window step: 100 ms) and subsequent visual inspection. Tables 2 and 3 present the mean number of accepted epochs per condition for the SPN (max = 62) and N400 (max = 124) analyses, respectively, in each participant group.
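The logic of a moving-window peak-to-peak check can be sketched in Python as follows. This is not ERPLAB's implementation: it handles a single channel, the function name is hypothetical, and it assumes that an epoch is flagged when the difference between the maximum and minimum voltage within any window exceeds 80 µV, which is one possible reading of the threshold reported above.

```python
def exceeds_peak_to_peak(epoch_uv, srate_hz=250.0, threshold_uv=80.0,
                         win_ms=200.0, step_ms=100.0):
    """Return True if any moving window has a peak-to-peak amplitude
    above `threshold_uv`. `epoch_uv` holds one channel's samples in µV."""
    if not epoch_uv:
        raise ValueError("epoch must contain at least one sample")
    win = max(1, round(win_ms * srate_hz / 1000))    # 50 samples at 250 Hz
    step = max(1, round(step_ms * srate_hz / 1000))  # 25 samples at 250 Hz
    # Trailing samples that do not fill a complete window are not
    # examined separately in this sketch.
    last_start = max(len(epoch_uv) - win, 0)
    for start in range(0, last_start + 1, step):
        segment = epoch_uv[start:start + win]
        if max(segment) - min(segment) > threshold_uv:
            return True
    return False


# A 4000 ms epoch at 250 Hz has 1000 samples.
slow_drift = [i * 0.1 for i in range(1000)]       # 100 µV over the epoch
blink_like = [0.0] * 500 + [100.0] + [0.0] * 499  # abrupt deflection
print(exceeds_peak_to_peak(slow_drift), exceeds_peak_to_peak(blink_like))
```

The example illustrates why a moving window suits long SPN epochs: a slow potential that changes by 100 µV over four seconds is retained, whereas an abrupt deflection of the same size within a 200 ms window is flagged.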
We focused on two ERP components: SPN, indexing anticipatory processes, and N400, indexing cue-picture integration; the ERP components were analyzed predictively based on prior findings. SPN was analyzed over three centrofrontal midline electrodes (AFZ, FCz and Fz) in two temporal windows, 800-1200 ms post-cue (early SPN) and 300 ms prior to picture presentation (late SPN; e.g. Moser et al., 2009); N400 was analyzed over nine electrodes (FC1, FCz, FC2, C1, Cz, C2, CP1, CPz and CP2) in the 350-500 ms time window (see Kutas and Federmeier, 2011). Statistical analyses were conducted on mean SPN amplitudes in the early (800-1200 ms) and late (3500-3800 ms) time windows of predicted maximal sensitivity for early and late SPN, respectively. We first conducted a repeated measures (RM) ANOVA on the SPN mean amplitudes in the bilingual group, with language (Polish, English) and valence (negative, neutral) as within-subject independent variables (we left the positive valence condition aside, given its commonly acknowledged reduced ability to elicit reliable SPN modulations). We then used planned comparisons by means of two-tailed paired t-tests to test the following predictions. (i) SPN amplitude should differ between negative and neutral anticipation conditions in Polish participants processing words in Polish. (ii) The same contrast should be reduced in amplitude or canceled when Polish participants are tested with English word cues. (iii) SPN amplitude should differ between negative and neutral anticipation conditions in English participants processing words in English. Furthermore, (iv) there should be an overall difference in SPN amplitude between the English and Polish versions of the experiment in Polish-English bilinguals, with more negative amplitude when testing was conducted in Polish than in English (Wu and Thierry, 2017).
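The two computations underlying the planned comparisons, a mean amplitude within a time window and a paired t statistic, can be sketched in Python as follows. The function names and all values in the example are hypothetical and do not reproduce the study's data; the sketch returns the t statistic and its degrees of freedom only, since obtaining the two-tailed P value requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def mean_amplitude(erp_uv, srate_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Mean voltage (µV) of an averaged waveform within a time window.

    `epoch_start_ms` is the latency of the first sample (e.g. -200).
    The window includes its start sample and excludes its end sample.
    """
    first = round((win_start_ms - epoch_start_ms) * srate_hz / 1000)
    last = round((win_end_ms - epoch_start_ms) * srate_hz / 1000)
    return mean(erp_uv[first:last])


def paired_t(cond_a, cond_b):
    """Paired-samples t statistic and degrees of freedom."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant mean amplitudes (µV) in one SPN window.
negative_cue = [-2.1, -1.4, -3.0, -0.9, -2.6]
neutral_cue = [-1.2, -1.0, -1.9, -0.8, -1.5]
t_stat, df = paired_t(negative_cue, neutral_cue)
print(round(t_stat, 2), df)
```

In this toy example, the negative t statistic indicates more negative amplitudes after negative than after neutral cues, which is the direction predicted for the L1 contrast in (i).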
Note that we did not conduct a between-group comparison in this study because the conditions under which the two groups were tested varied in several ways (bilinguals knew two languages as opposed to only one, they underwent a double session comprising four blocks rather than two, they saw four instances of each picture rather than two and they had to switch between languages across experimental blocks) making such a comparison invalid. Thus, English native speakers in this study were considered a control group to independently validate the English stimuli. Finally, (v) we predicted a significant modulation of N400 amplitude elicited by picture targets when comparing unrelated vs related cue-picture pairs, irrespective of group or language, but we had no prediction as regards the direct comparison of Polish and English cue-picture pairs in Polish-English bilinguals. N400 analyses were performed on correct trials only by means of RM ANOVA with relatedness (related, unrelated), valence (negative, neutral) and language (Polish, English) as within-subject independent variables.
(Figure 4). N400 topography was asymmetric, with slightly greater amplitudes over the right hemisphere. Furthermore, neither the main effect of language, F(1,16) = 0.9, P = 0.35, ηp² = 0.06 (see Figure 5), nor the main effect of valence, F(1,16) = 1.5, P = 0.26, ηp² = 0.08, nor any of the interactions turned out to be statistically significant (P > 0.05).
Figure 7). Neither the main effect of valence nor the relatedness × valence interaction turned out to be significant (P > 0.05). In a similar vein to the bilingual group, N400 topography was asymmetric, with slightly greater amplitudes over the right hemisphere.
Fig. 3. Event-related potentials elicited by cue language in the bilingual group. Waveforms illustrate brain potential variations computed via linear derivation from three midline frontal electrodes (AFZ, FZ and FCZ). The window of analysis starts just after verbal cue presentation, coinciding with the display of a fixation cross. The shaded area highlights the window of significant differences. Black dots on the topographical maps depict the electrode sites of interest. Bar graphs represent mean amplitude averaged over the electrodes of interest in the early (left panel) and late (right panel) SPN window. Error bars depict 95% confidence interval.
Fig. 4. ERPs elicited by picture targets (S2) as a function of relatedness with the preceding verbal cue (S1) in the bilingual group. Waveforms illustrate brain potential variations computed via linear derivation from nine centro-parietal electrodes (FC1, FCz, FC2, C1, Cz, C2, CP1, CPZ and CP2). Time 0 coincides with picture presentation onset. The shaded area highlights the window of significant differences. Black dots on the topographical maps depict the electrode sites of interest. The bar graph represents mean N400 amplitude averaged over the electrodes of interest in the window of analysis. Error bars depict 95% confidence interval.

Discussion
We investigated whether the anticipation of emotional information is contingent upon the language of operation in bilinguals. We hypothesized that anticipating a negative picture cued by a negative word in L1 would lead to more pronounced amplitudes of the early and late SPN than when cued by a neutral word. Moreover, such difference would be significantly reduced or even fail to reach significance in L2. Electrophysiological data supported our hypothesis, showing increased amplitudes in the early and late SPN window in the anticipation of pictures cued by negative words in the bilinguals' L1 (Polish) but not in their L2 (English). As expected, the same English words with a negative valence elicited enhanced SPN amplitudes in the monolingual English control group, the effect being restricted to the late SPN window. However, the predicted differences in the bilingual group occurred in the absence of a significant critical interaction between language and valence.
Fig. 5. ERPs elicited by picture targets (S2) as a function of the language (L1 Polish or L2 English) of the verbal cue (S1) in the bilingual group. Waveforms illustrate brain potential variations computed via linear derivation from nine centro-parietal electrodes (FC1, FCz, FC2, C1, Cz, C2, CP1, CPZ and CP2). Time 0 coincides with picture presentation onset. Black dots on the topographical maps depict the electrode sites of interest. The bar graph represents mean N400 amplitude averaged over the electrodes of interest in the window of analysis. Error bars depict 95% confidence interval.
We contend that the absence of an interaction between language and valence in bilinguals relates to three characteristics of the present study that should be considered in the planning of future experiments. First, online activation of translation equivalents is likely to have played a role in reducing the size of such a potential interaction, given that access to translation equivalents has been shown to be unconscious and automatic in sequential bilinguals (e.g. Thierry and Wu, 2007; Wu and Thierry, 2010). Even though such effects have also been shown to be reduced for negatively valenced words (Wu and Thierry, 2012), it must be kept in mind that the bilinguals involved in the current study were highly proficient in English and had been immersed in an English-speaking environment for an average of 6.6 years (range 3 months-16 years). This means that language non-selective lexical access might have been more effective in the Polish-English bilinguals tested here than in a group of Chinese-English bilinguals having spent only a few months in the UK, for instance. And, indeed, the valence effect observed in bilinguals in the English cueing blocks was close to marginal (P = 0.08). In other words, while the interaction may have been measurable in less balanced bilinguals or using a different paradigm (Jończyk et al., 2016), it may well be that non-selective language access in the current sample was too strong to allow the interaction to appear. In addition, the affective pictures selected for this study were particularly mild (see detailed discussion of this point below) compared with the kind of highly arousing stimuli used in SPN experiments previously (e.g. from the IAPS database, Lang et al., 1997). Finally, the experiment involved a high level of stimulus repetition (picture targets were seen four times each by bilingual participants), which is likely to have increased habituation effects (a factor known to reduce SPN sensitivity, Catena et al., 2012).
Taken together, our findings corroborate previous evidence showing that linguistic cues, particularly those of negative valence, can elicit affective anticipation effects (Swannell et al., 2016). Furthermore, this effect could be reduced in the L2 of bilinguals, suggesting that language may act as a modulating factor of affective anticipation (see Poli et al., 2007;Brunia et al., 2011a;Swannell et al., 2016;Kotani et al., 2017). Future studies using less balanced bilinguals and more emotionally potent stimuli will hopefully confirm such modulation or discard it more clearly.
Previous studies on physiological anticipation often measured SPN at the FZ electrode site only (Moser et al., 2009; Thiruchselvam et al., 2011; Fuentemilla et al., 2013; Morís et al., 2013; Pornpattananangkul et al., 2017). Others reported more extensive scalp distribution covering the midline electrode sites (FPZ, FZ, FCz, Cz, CPZ, PZ, POZ; Poli et al., 2007). Here, we report a somewhat intermediate result, with SPN amplitude being maximal over AFZ, FZ and FCz. At least two methodological choices in the current experiment might have worked against a more widespread effect.
First, we elected not to rely on frequently used pictorial stimuli from affective picture databases (e.g. Lang et al., 1997; Dan-Glauser and Scherer, 2011; Marchewka et al., 2014), because we considered such pictures too arousing and potentially disturbing, or in some cases even traumatizing. Beyond the potential ethical issues arising from the use of such stimuli, we made this choice because we wanted our results to be more ecologically valid and more representative of everyday functioning of the human brain. Indeed, one rarely witnesses lethal accidents or mutilations, and thus, studies that have used emotionally extreme stimuli might have artificially inflated the emotional response or even elicited responses that are not representative of everyday experience. Affective valence and arousal were, therefore, low in this experiment as compared with the literature, and thus the anticipation effects measured could be expected to be somewhat weaker. If the goal of future experiments is to establish reduced sensitivity in the second language to potentially traumatic stimuli, then experimenters might be advised to use far more potent affective materials.
Fig. 7. ERPs elicited by picture targets (S2) as a function of relatedness with the preceding verbal cue (S1) in the monolingual group. Waveforms illustrate brain potential variations computed via linear derivation from nine centro-parietal electrodes (FC1, FCz, FC2, C1, Cz, C2, CP1, CPZ and CP2). Time 0 coincides with picture presentation onset. Shaded gray areas highlight the window(s) of significant differences. Black dots on the topographical maps depict the electrode sites of interest. The bar graph represents mean N400 amplitude averaged over the electrodes of interest in the window of analysis. Error bars depict 95% confidence interval.
Second, we used a fully rotated, counterbalanced experimental design likely to have further weakened the strength of anticipatory processes. While this avoided the need for verbal cue repetition in each of the languages, picture repetition might have reduced SPN amplitudes through habituation. A few of the bilingual participants indicated that they had been able to guess the upcoming picture in some cases in the third and/or fourth experimental blocks. Given that the experimental sessions were approximately twice as long in the bilingual group, these participants ended up being exposed to more repetition, which might in turn have reduced the anticipation effect to a greater extent. An alternative explanation is that our groups differed in affective suggestiveness. Future studies could use yoked control groups to test this possibility, and more detailed psychometric measures may be needed to tap into between-group differences in general emotional sensitivity. Nevertheless, taking into account that the anticipation of uncertain outcomes tends to yield more intense experience (Catena et al., 2012) and considering that SPN amplitude tends to decrease as a function of learning, a potential valence-dependent effect of language on SPN amplitude could have been weakened in this study.
In line with previous studies of semantic processing in bilinguals, we found that SPN amplitudes were overall greater after an English than a Polish cue, which could be due to a partial overlap with the N400 component elicited by the cue and is generally consistent with the idea that processing information in the second language tends to require greater cognitive effort (Frenck-Mestre, 2002; Moreno and Kutas, 2005; Thierry and Wu, 2007; Martin et al., 2013). However, both early and late SPN modulations were observed well beyond the window of semantic processing, lending support to the view that they did reflect anticipation of the target picture.
Moreover, we observed a language main effect in bilinguals, which could be interpreted as a sign of greater overall anticipation in L2 than in L1 or as a sign that when participants were exposed to L2 word cues, their L1 was suppressed to a greater extent than their L2 when they were exposed to L1 word cues (an effect akin to that shown by Wu and Thierry, 2017, in the context of picture naming). If this differential suppression account is correct, then both negative and neutral words may have led to the inhibition of L1 when cues were in L2, with L2 inhibition being weaker for neutral than negative words when cues were in L1. Even though this leads to a slightly different interpretation of the results, this account remains consistent with greater affective sensitivity in L1 than L2.
Our findings suggest that highly proficient and immersed bilinguals operating in their second language may experience a reduction in sensitivity when anticipating affective content. This potential reduction, however, failed to yield a critical interaction in the current study, unlike in previous studies showing that proficient bilinguals are less reactive to negative information presented in the second language and appear 'protected' against its negative impact. For instance, in the study by Wu and Thierry (2012), Chinese-English bilinguals did not show a predicted priming effect relating to unconscious non-selective lexical access from English to Chinese (Thierry and Wu, 2007; Wu and Thierry, 2010) when the English prime word had a negative valence. In a more recent study using emotionally realistic sentence contexts, access to negative information in the second language of immersed, proficient Polish-English bilinguals appeared to be reduced (Jończyk et al., 2016). Furthermore, converging observations based on pupil dilation (Iacozza et al., 2017) and electrodermal measurements (Jankowiak and Korpal, 2017; see also García-Palacios et al., 2018) lend support to the idea that individuals are more detached from emotionally charged information in the L2 (for a discussion of affective disembodiment in bilingualism, see Pavlenko, 2012; Jończyk, 2016c; Sheikh and Titone, 2016), at the same time providing neurophysiological support for findings reported in introspective and clinical bilingual contexts (see Pavlenko, 2012; Caldwell-Harris, 2015; Jończyk, 2016a).
Despite the lack of a significant language by valence interaction in our bilingual group, our findings within each language context are not inconsistent with the so-called 'foreign language effect' (Keysar et al., 2012; Costa et al., 2014a). When bilinguals operate in their second language, they have been shown to exhibit more utilitarian behavior (Costa et al., 2014b; Geipel et al., 2015; Corey et al., 2017; Costa et al., 2017). Furthermore, late bilinguals are willing to take more risks when receiving positive feedback in their native language, that is, the hot-hand effect is reduced when they operate in their second language (Gao et al., 2015). The foreign language effect is thought to relate to the subjective impression of relative affective detachment when one operates in the second language, in line with hypotheses made on the basis of introspective approaches to bilingualism and emotion (for recent reviews, see Dewaele, 2010; Caldwell-Harris, 2015; Jończyk, 2016a) and bilinguals' autobiographical memory (Rubin, 1998, 2000; Larsen et al., 2002; Marian and Kaushanskaya, 2008; Pavlenko, 2014).

Conclusion
In this study, we looked into the effects of language of operation on the anticipation of a forthcoming emotional picture in a classical S1-S2 priming paradigm. ERP data provided evidence regarding the feasibility of a neurophysiological investigation of affective anticipation in bilinguals, given that a reliable SPN modulation was found in both bilinguals and monolinguals when they operated in their native language. Furthermore, our results suggest that emotional experience in the second language may be reduced in power, although this would require validation based on a within-subject interaction between affective valence and language of operation. Given that participants in our study were very fluent in English and had been immersed in the L2 culture for a significant period of time, we believe that this difference may be more pronounced in less experienced bilinguals. Also, increasing the potency of emotional stimuli may lead to a significant within-subject effect, but we are conscious that this may come at the cost of ecological validity.

Code availability
All MATLAB code used for the pre-processing of the data is available at Open Science Framework (OSF), https://osf.io/tdzsk/.

Data availability
The data collected and analyzed for the purpose of the current study are available from the corresponding author upon reasonable request.