Abstract

Foreign language (FL) knowledge has been shown to contribute significantly to FL reading performance. Studies have contrasted the contribution of FL vocabulary and syntactic knowledge, following a dichotomous view of these components, producing mixed results. Despite the increasingly recognized formulaic nature of language, the contribution made by phraseological knowledge to reading ability has not been investigated systematically. This study examines the impact of a broader construct definition of linguistic knowledge—which includes a phraseological component—in explaining variance in reading performances. Test scores of 418 learners of English as a foreign language (EFL) were modeled in a structural equation model, showing that a phraseological knowledge measure outperformed traditional syntactic and vocabulary measures in predicting reading comprehension variance. Additional insights into the role of phraseological knowledge were gained through verbal protocol analysis of 15 EFL learners answering reading comprehension items that targeted the understanding of phrasal expressions within written context. The findings hint at an underestimated, but critical, role of phraseological knowledge in FL reading, and are relevant to both the assessment and the teaching of EFL ability.

INTRODUCTION

For people across the globe, being able to read in a foreign language (FL) is often instrumental to academic and professional success, as well as personal development, and ‘this is particularly true of English’ (Alderson 1984: 1). However, despite the wealth of research on the nature of reading, our insights remain incomplete, and English as a foreign language (EFL) reading development may not be optimally supported. Over the past fifty years, several reading models have been suggested, with two of the most prominent types being (i) process and (ii) component models. Process models include so-called top-down, bottom-up, and interactive approaches (see Grabe and Stoller 2011 for an overview). Componential approaches try to explain reading performance through components associated with lower- and higher-order reading processes. In general, fluent reading comprehension is assumed to involve automatized lower-level processes such as lexical access, syntactic parsing, and semantic proposition formation, and higher-level processes such as forming a text and situation model, comprehension monitoring and strategy use (Koda 2005; Grabe 2009). Component models thus try to capture reading ability and identify ‘possible explanatory skill factors involved in the reading process, as opposed to explaining how those components operate in the process’ (Shiotsu and Weir 2007: 99). Although Hoover and Tunmer’s (1993: 4) statement that the components are ‘theoretically distinct and empirically isolable constituents’ has been questioned (Urquhart and Weir 1998) and presents problems for empirical investigations (Alderson and Kremmel 2013), component approaches to reading ability have proven to be fertile ground for many researchers over recent decades. Such a view of reading ability appears particularly relevant and beneficial for diagnostic assessment in terms of explaining developmental and individual differences. Jeon and Yamashita (2014: 161), for instance, assert that ‘[t]he approach is […] useful to language teachers and testers by helping them identify areas of individual ability differences and design effective intervention programs and tests’. In addition, it contributes to a clearer understanding of the reading construct in general (Jeon and Yamashita 2014).

Virtually all theoretical component models of FL reading include vocabulary and structural knowledge in some form as key elements of reading ability. For example, Grabe (1991: 379) posited ‘vocabulary and structural knowledge’ as one of six components of fluent FL reading. In a more recent model, Grabe (2009) maintains vocabulary and syntactic knowledge as crucial components of FL reading comprehension, but lists them as two separate elements rather than one integrated unitary component as in his earlier model. This comparison of Grabe’s models illustrates that while scholars agree that these aspects of linguistic knowledge are key to FL reading comprehension, there are considerable differences in their conceptualizations. Most often, a dichotomous view of vocabulary and syntax is favored, assuming a separability of the elements. However, this has been problematized from several research perspectives. Guo and Roehrig (2011), for instance, established in a confirmatory factor analysis that vocabulary knowledge and syntactic awareness should be collapsed into a single ‘language’ factor, indicating an inseparability of these psycholinguistic constructs. In computational readability research, Vajjala and Meurers (2012) found that combining lexical and syntactic features improved readability classifications. Römer (2009), in a corpus linguistic approach, also states that vocabulary and syntax are inseparable (see Alderson and Kremmel 2013 for further discussion of this issue).

Nevertheless, the traditional view of vocabulary as form-meaning knowledge of individual words and syntax as the way ‘words are put together to form sentences’ (Sampson 1985: 38) has led to several studies into the relative significance of each of these components for FL reading ability. This body of research, however, has yielded inconclusive findings. Most studies identified a prevalence of vocabulary (e.g. Sternberg 1987; Hacquebord 1989; Bossers 1992; Brisbois 1995; Schoonen et al. 1998; Nassaji and Geva 1999; Yamashita 1999; Alderson 2000; Nassaji 2003; Brunfaut 2008), but some studies suggest that structural knowledge might be an equally, if not more, important predictor of successful FL reading (Droop and Verhoeven 2003; Van Gelderen et al. 2003, 2004; Shiotsu and Weir 2007; Nergis 2013). All of these studies, however, neglect the formulaic nature of language and fail to account for phraseological knowledge as a potentially influential contributor to reading ability.

Phraseological knowledge can be roughly defined as knowledge of formulaic sequences or multi-word expressions (MWEs). Although numerous definitions have been put forward for formulaic sequences (e.g. Wray 2002, 2008), most see them as ‘matching a single meaning or function to a form, although that form consists of multiple orthographic or phonological words’ (Martinez and Schmitt 2012: 299). They are combinations of words that co-occur more frequently than would be expected by chance and can take very different forms. Siyanova-Chanturia and Martinez (2014: 1) loosely define them as ‘(semi) fixed, recurrent phrases, such as collocations (strong tea), binomials (black and white), multi-word verbs (put up with), idioms (spill the beans), proverbs (better late than never), speech formulae (What’s up), lexical bundles (in the middle of), and other types’. The present article focuses on knowledge of phrasal expressions, that is, (semi)fixed sequences of ‘two or more co-occurring but not necessarily contiguous words with a cohesive meaning or function that is not easily discernible by decoding the individual words alone’ (Martinez and Schmitt 2012: 304).

Conklin and Schmitt (2012: 46), surveying the literature, suggest that formulaic language ‘makes up between one third and one half of discourse’ (see also Nattinger and DeCarrico 1992; Biber et al. 1999; Erman and Warren 2000; Oppenheim 2000). In addition, Martinez and Schmitt (2012: 299) have pointed out that ‘research has now established that [formulaic language] is fundamental to the way language is used, processed, and acquired in both the L1 and L2’. With reference to reading, Martinez and Murphy (2011) have suggested that, since formulaic language is ubiquitous, knowledge of MWEs might contribute significantly to reading comprehension. They were surprised, though, to find little information about the potential role of formulaic language for EFL reading comprehension, especially ‘considering the relative wealth of research and literature on L2 reading comprehension and, separately, multi-word expressions in English’ (Martinez and Murphy 2011: 273).

Although there have been a number of studies investigating the processing of MWEs, particularly of idioms, in reading contexts (Conklin and Schmitt 2008, 2012; Siyanova-Chanturia et al. 2011a, 2011b; Tremblay et al. 2011), these studies have primarily aimed at explaining the storing and retrieval of MWEs in and from the lexicon, and the processing (speed) advantages that phraseological knowledge entails. There is still a gap in research on how MWEs affect reading comprehension (Martinez 2013). Martinez and Murphy (2011) demonstrate convincingly that comprehension analyses that operate on the level of individual words only, for example in coverage research, fall short of a full representation of comprehension difficulties, particularly in FL contexts. They tested 101 Brazilian EFL learners’ comprehension of two texts containing identical high-frequency words, but in one of the conditions, these words were arranged into MWEs. They found ‘that learners’ comprehension not only decreased significantly when multiword expressions were present in text but students also tended to overestimate how much they understood as a function of expressions that either went unnoticed or were misunderstood’ (Martinez and Murphy 2011: 267). Their findings are a first indication that phraseological knowledge deserves more attention in reading comprehension research and should thus also be incorporated in component models as a potentially important predictor variable. However, their study primarily investigated the impact of idiomaticity on comprehensibility.

The present article attempts to draw attention to the oft-neglected role of formulaic sequences as part of linguistic knowledge in reading ability by exploring whether FL readers’ phraseological knowledge contributes to explaining FL reading proficiency over and above more traditional conceptualizations of linguistic knowledge. It also aims to gain insights into how FL readers process MWEs when completing FL reading tasks. Therefore, the following two research questions were formulated:

RQ1. Does a broader construct definition of syntactic and vocabulary knowledge including phraseological knowledge provide useful information for the prediction of EFL reading performance?

RQ2. How do advanced EFL readers make use of MWEs in reading?

RESEARCH DESIGN

To address these two research questions, two studies were conducted. In the first study, the relative impact of syntactic, vocabulary, and phraseological knowledge (i.e. the independent variables) on FL reading comprehension (i.e. the dependent variable) was examined by means of structural equation modeling (SEM) (RQ1). The second study was more qualitative in nature, and explored the processing of MWEs by means of think-aloud protocols (RQ2). Below, the methodology and results of Study 1 will be presented first, followed by the methodology and results of Study 2.

Study 1: The impact of syntactic, vocabulary, and phraseological knowledge on FL reading

Participants

The participants were 418 Austrian EFL learners from eight schools in six different provinces, in their penultimate year of secondary education. Fifty-seven percent were female vs. 35 percent male (8 percent did not indicate). The mean age was 16.9 years, and the majority (86 percent) had a German-L1 background. According to the Austrian national curriculum (BMUKK 2004), the stipulated proficiency target of learners at this level of education is level B2 of the Common European Framework of Reference (CEFR) (Council of Europe 2001).

Instruments

The dependent variable (EFL reading) and the independent variables (syntactic, vocabulary, and phraseological knowledge) were operationalized by means of four different tests.

Reading comprehension measure

The reading measure comprised four EFL reading tasks sampled from previously administered reading tests of the Austrian EFL school-leaving examination. These tasks had been developed on the basis of CEFR-linked test specifications (SRP 2009), piloted and standard set at CEFR B2 level. The test consisted of authentic reading texts, ranging from 461 to 653 words in length, with a different test format for each text (multiple choice, note form, and two types of multiple matching). The 33 reading items were judged by the item writers and item moderators to be assessing the ‘understanding of main ideas and supporting details’ (Green 2000).

Syntactic knowledge measure

Participants’ syntactic knowledge was measured by Shiotsu’s (2010) test of syntactic knowledge. It consists of 32 multiple-choice items that present four semantically similar options for a gap in a sentence, only one of which fits syntactically. A sample item is:

We found………………. to understand his lecture.

□ difficulty   □ difficult   □ so difficult   □ it difficult

Vocabulary knowledge measure

As a vocabulary knowledge measure, the DIALANG Advanced Vocabulary Test was chosen, which embeds vocabulary items in a context and uses both selected- and constructed-response formats. The 30 items in the test cover four aspects of word knowledge: form-meaning link knowledge, collocational knowledge, derivational knowledge, and association knowledge (Alderson 2005). A sample item is:

What is the best word for the gap in the sentence. Write it in the box. The word begins with an ‘s’.

Angela got the job as she was clearly…………………………… to the other candidates.

Phraseological knowledge measure

Martinez’ (2011) Test of MWEs was the most suitable operationalization of the construct ‘phraseological knowledge’ for the purpose of this study (see definition in the Introduction). The version of this four-option multiple-choice test used in this study consisted of 60 items based on Martinez and Schmitt’s (2012) PHRASE list, collated from the British National Corpus. This list, from which the items were sampled, was designed adhering to the principles of ‘high frequency, meaningfulness and relative non-compositionality’ (Martinez and Schmitt 2012: 304). The 60 items were sampled in equal amounts from the first five 1,000-word bands of the most frequent word families. A sample item is:

at all: I don’t like it at all.
a. all the time   b. in any way   c. at first   d. sometimes

According to Nattinger and DeCarrico’s (1992) definition of phrasal expressions, the construct of Martinez’ test represents the middle ground between the two extreme ends of the vocabulary-syntax cline. Including a measure like this in a model of reading is therefore an attempt to operationalize a fuller representation of linguistic knowledge when estimating the contribution vocabulary and structural knowledge make to reading ability, as it goes beyond the traditional dichotomy by taking phraseological MWEs into account.

Procedures

After all materials and procedures had been piloted, the tests were administered in one sitting at each school in a randomized order to minimize sequence effects. Participants had 60 min to complete the reading test and 20 min for each of the other test papers.

Analyses and results

Descriptive statistics and reliability values for all four instruments can be found in Table 1. Two participants’ data were removed from the data set, since one had not completed one of the instruments, and the other was identified as an outlier.1

Table 1:

Study 1—Descriptive statistics (n = 416)

Test         Minimum   Maximum   Mean    SD     Cronbach’s α
Reading      10        33        26.37   4.91   .83
Syntax                 32        23.50   7.73   .93
Vocabulary             28        15.85   5.52   .82
Phrase       26        60        46.67   7.56   .87

To investigate the predictive value for EFL reading of (a) syntactic knowledge, (b) vocabulary knowledge, and (c) phraseological knowledge (RQ1) and their relative contributions, SEM was used. Since SEM requires at least two observed variables for each latent variable, the individual tests were split into random halves using Gulliksen’s (1950) Matched Random Subtest method. The model in Figure 1 was evaluated using the software AMOS (Maximum Likelihood method). Model fit was judged following Raykov and Marcoulides (2000) and Schumacker and Lomax (1996), who stipulate that a model is acceptable if the chi-square test is nonsignificant, the chi-square per degree of freedom is below 2, the goodness-of-fit index (GFI), adjusted goodness-of-fit index (AGFI), normed fit index (NFI), comparative fit index (CFI), and Tucker-Lewis index (TLI) are above 0.9, and the root mean square error of approximation (RMSEA) value is below 0.05. The model fit indices (see Table 2) meet all these criterion values. The data also had a multivariate normal distribution (Kline 2011).
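The idea of splitting a test into matched random halves can be illustrated with a short sketch. It is a simplification and not the procedure used in the study: items are matched on facility (proportion correct) only, whereas a fuller matching would also take item discrimination into account. The function name, the item labels, and the facility values are all hypothetical.

```python
import random


def matched_random_halves(item_facility, seed=0):
    """Split items into two halves matched on facility (simplified sketch).

    Items are ranked by facility, adjacent items are paired, and one member
    of each pair is randomly assigned to each half.
    """
    rng = random.Random(seed)
    ranked = sorted(item_facility, key=item_facility.get)
    half_a, half_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the matched pair
        half_a.append(pair[0])
        half_b.append(pair[1])
    if len(ranked) % 2:  # odd item count: the leftover item goes to a random half
        rng.choice([half_a, half_b]).append(ranked[-1])
    return half_a, half_b


# Hypothetical facility values for a 32-item test (placeholders, not study data)
value_rng = random.Random(42)
facility = {f"item{i:02d}": round(value_rng.uniform(0.30, 0.95), 2) for i in range(1, 33)}
half_a, half_b = matched_random_halves(facility)
print(len(half_a), len(half_b))
```

Each half score can then serve as one observed indicator of the latent variable that the full test is meant to measure.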

Table 2:

Study 1—Estimates of model-to-data fit for model

χ2       χ2/df   TLI     CFI     NFI     GFI     AGFI    RMSEA
15.721   1.123   0.999   0.999   0.994   0.991   0.976   0.017
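The values in Table 2 can be checked against the acceptance criteria with a few lines of code. The sketch below is illustrative only: the RMSEA cut-off is set to the conventional 0.05, which is an assumption of this sketch, and the parameter count assumes one fixed loading per latent variable, a detail the text does not report. The p-value of the chi-square test is not given in Table 2, so that criterion is not checked here.

```python
# Fit statistics as reported in Table 2
fit = {"chi2": 15.721, "chi2_per_df": 1.123, "TLI": 0.999, "CFI": 0.999,
       "NFI": 0.994, "GFI": 0.991, "AGFI": 0.976, "RMSEA": 0.017}


def meets_criteria(fit, index_min=0.9, ratio_max=2.0, rmsea_max=0.05):
    """Return a pass/fail flag per criterion (cut-offs are parameters)."""
    checks = {"chi2_per_df": fit["chi2_per_df"] < ratio_max,
              "RMSEA": fit["RMSEA"] < rmsea_max}
    for name in ("TLI", "CFI", "NFI", "GFI", "AGFI"):
        checks[name] = fit[name] > index_min
    return checks


checks = meets_criteria(fit)

# Degrees of freedom implied by the two reported chi-square figures
implied_df = round(fit["chi2"] / fit["chi2_per_df"])

# Degrees of freedom of a model with 8 observed half-test scores and 4 latent
# variables. ASSUMPTION: one loading per latent variable is fixed, leaving
# 4 free loadings, 8 error variances, 3 predictor variances, 3 predictor
# covariances, 3 regression paths, and 1 disturbance.
n_observed = 8
moments = n_observed * (n_observed + 1) // 2
free_parameters = 4 + 8 + 3 + 3 + 3 + 1
model_df = moments - free_parameters

print(all(checks.values()), implied_df, model_df)
```

Under these assumptions, the degrees of freedom implied by the reported figures and those derived from the model structure agree.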

Figure 1:

Study 1—Componential model of FL reading

The model explains 75 percent of the variance in EFL reading test scores (see Table 3). The results seem to support a broader construct definition of linguistic knowledge and an incorporation of knowledge of MWEs as a separate latent variable because the model improves from explaining only 69 percent of the variance when including only the vocabulary and the syntax measure as predictor variables. Importantly, knowledge of phrasal expressions emerged as the strongest contributor to FL reading test performance (β = .57). FL vocabulary knowledge made the second biggest contribution (β = .29). Although FL syntactic knowledge correlated moderately with FL reading test performance (r = .39*), it did not explain variance (β = .06) beyond that explained by phraseological and vocabulary knowledge.

Table 3:

Study 1—Regression and correlation summary

Variable pair             β      r      Percent explained
Reading × Syntax          .06    .39*   15
Reading × Vocab.          .29*   .83*   69
Reading × Phrasal expr.   .57*   .85*   72
Syntax × Vocab.           –      .45*   20
Syntax × Phrasal expr.    –      .36*   13
Vocab. × Phrasal expr.    –      .90*   81

Percent jointly explained (Reading × Syntax, Vocab., and Phrasal expr.): 75

Note: *p < .05.
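The ‘percent explained’ row of Table 3 corresponds to the squared correlation coefficients, expressed as percentages. The following sketch reproduces that arithmetic from the reported r values; the dictionary keys are shorthand labels chosen for this illustration.

```python
# Correlations as reported in Table 3
correlations = {
    "Reading x Syntax": 0.39,
    "Reading x Vocab.": 0.83,
    "Reading x Phrasal expr.": 0.85,
    "Syntax x Vocab.": 0.45,
    "Syntax x Phrasal expr.": 0.36,
    "Vocab. x Phrasal expr.": 0.90,
}

# Percent explained as reported in Table 3
reported = {
    "Reading x Syntax": 15,
    "Reading x Vocab.": 69,
    "Reading x Phrasal expr.": 72,
    "Syntax x Vocab.": 20,
    "Syntax x Phrasal expr.": 13,
    "Vocab. x Phrasal expr.": 81,
}

# Shared variance is the squared correlation, here rounded to whole percent
percent_explained = {pair: round(100 * r ** 2) for pair, r in correlations.items()}

for pair, value in percent_explained.items():
    print(f"{pair}: {value} percent (reported: {reported[pair]})")
```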

The strong covariance between the phraseological knowledge factor and both the vocabulary and syntactic components suggests that phrasal expressions are partly lexical in nature, but also involve at least some structural or grammatical elements.2 However, the correlation between vocabulary and phraseological knowledge is much stronger than between syntactic and phraseological knowledge (r = .90* vs. r = .36*). This might indicate that knowledge of such multi-word chunks is predominantly lexical when placed on the lexicogrammar continuum (Sinclair 2004). It should be noted, though, that although correlating highly with vocabulary knowledge, phraseological knowledge does not absorb the predictive power of the vocabulary measure. It thus seems justified to some extent to postulate phraseological knowledge as a latent variable that is not subordinate to either vocabulary or syntactic knowledge.

Study 2: MWE processing in the context of FL reading comprehension

Following up on the findings of Study 1, which suggest that knowledge of MWEs is indeed crucial to FL reading comprehension, a more qualitative study was undertaken to gain insights into how learners process and make use of these phrases whilst trying to comprehend what they read in the FL (RQ2). For this purpose, a group of EFL learners was asked to think aloud whilst completing an EFL reading test comprising texts which included MWEs in those parts of the text that are crucial to item completion. Despite the risks of veridicality and reactivity (inaccurate reporting and alterations of thought processes due to talking out loud), concurrent think-alouds have been shown to be able to reveal readers’ cognitive and strategic processing or their reasoning behind reading test answers, given careful research design and data interpretations (Green 1998; Bowles 2010). Since the present study constituted a first exploration of the role of MWEs in learners’ processing of written texts, this method was considered suitable and satisfactory for our purposes, whilst adhering to Bowles’ (2010) recommendations for data collection and analyses to minimize any disadvantages associated with think-alouds.

In addition, learners’ phraseological knowledge was controlled for with an MWE test.

Participants

The participants were 15 Austrian EFL learners who were sampled from the same population as in Study 1, but had not taken part in the first study. They were therefore again L1-German EFL learners in their penultimate year of secondary education, expected to be at CEFR B2 according to the national curriculum. Ten participants were female, five were male, and they were on average 17 years old.

Instruments

Phraseological knowledge measure

The participants’ knowledge of MWEs in isolation was determined by means of a version of Martinez’ (2011) test of MWEs, described above in Study 1.

Reading comprehension measure

To investigate whether and how FL learners process and make use of MWEs in order to achieve textual comprehension, an EFL reading test was developed, which tested information from the text for which MWEs were essential for comprehension. The test consisted of two texts, 18 items, and two example items, and was developed according to the following procedure. First, a set of MWEs was selected for embedding in the reading test. In practice, these were the 60 target MWEs of Martinez’ test of MWEs. To distribute these over two texts, the MWE list was split into half, retaining a balance in the representation of all frequency levels in each half. Two texts of 615 and 581 words respectively (‘The Artist’ and ‘Great Brit Boys Bake’) were then written by an experienced item writer, each containing one of these two sets of MWEs. For each text, nine reading comprehension items (and one example item) were developed by the researchers in the format True/False/Justification, a response format employed in the Austrian national school-leaving exam and thus familiar to the participants. In this format, learners are presented with statements on the contents of the text and have to decide whether each of the statements is either ‘True’ or ‘False’. In order to be awarded the point for a correct answer, however, learners also need to cite the first four words of the sentence in the text which they identify as containing the relevant justification for their decision. The texts and items are available in the online Supplementary material.

The items were intended to target the comprehension of the MWEs in written context. It should be emphasized that while comprehension of the MWEs was deemed very important or indeed necessary to arrive at the correct answer, care was taken in the design of the materials that this was not a vocabulary test but a reading test. To ensure that understanding these phrases was indeed crucial to answering the reading items, seven expert judges, all with at least 3 years of experience in language testing and a degree in language testing, Teaching English as a Foreign Language (TEFL) or English language studies, were asked to judge on a Likert scale from 0 to 3 how important they thought understanding particular phrases was to answering each item correctly. Inter-rater reliability was high, with an average correlation of .91. Only MWEs with a mean of 1.5 or higher on this four-point scale were examined in the think-aloud analyses. This resulted in one item being dropped, as the expert judges did not consider the targeted MWE sufficiently important to correct item completion to warrant inclusion in the analyses. In addition, the judges’ data were used to narrow down the analysis targets. While the targeted justification sentences might have contained several MWEs from the set, the judges identified only one relevant MWE per reading item in 94 percent of the cases. This means that for only one reading item were two MWEs judged to be of importance to answering the item correctly.
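The selection rule described above (a mean judge rating of 1.5 or higher) can be sketched as follows. The ratings, item labels, and MWE labels are hypothetical placeholders and not the judges’ actual data; only the final arithmetic uses figures reported in the text (18 items, one dropped, one retained item with two relevant MWEs).

```python
from statistics import mean


def select_target_mwes(ratings, threshold=1.5):
    """Keep, per reading item, the MWEs whose mean judge rating reaches the threshold."""
    selected = {}
    for item, mwes in ratings.items():
        kept = [mwe for mwe, scores in mwes.items() if mean(scores) >= threshold]
        if kept:  # items without any sufficiently important MWE are dropped
            selected[item] = kept
    return selected


# HYPOTHETICAL ratings by seven judges on the 0-3 scale (not study data)
ratings = {
    "item_1": {"mwe_a": [3, 3, 2, 3, 3, 3, 2], "mwe_b": [0, 1, 0, 0, 1, 0, 0]},
    "item_2": {"mwe_c": [1, 0, 1, 2, 1, 1, 1]},
    "item_3": {"mwe_d": [2, 3, 2, 2, 3, 2, 2], "mwe_e": [2, 2, 1, 2, 2, 1, 2]},
}
selected = select_target_mwes(ratings)
print(selected)

# Figures reported in the text
retained_items = 18 - 1                  # one of the 18 items was dropped
single_mwe_items = retained_items - 1    # one retained item had two relevant MWEs
share_single = round(100 * single_mwe_items / retained_items)
print(share_single)
```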

Procedures

In an initial pilot stage, four participants were asked to complete the MWE and the reading tests, and provide verbal protocols while completing the reading tasks. Since no changes appeared to be necessary to the materials or procedures after the trial, these participants’ data were retained for the main study.

All participants were presented with the materials in the following order. First, they were given a True/False/Justification task designed for the Austrian school-leaving exam as a warm-up and asked to think aloud in whichever language(s) they felt most comfortable. When the researcher felt that the participants had understood the verbal protocol procedure, they were given the reading tasks containing the MWEs and asked to think aloud. Participants took about 40 min on average to complete the tasks whilst thinking aloud, and the verbal protocols were audio-recorded. To minimize sequence effects in the sample, the presentation order of the two reading tasks was randomized. After having completed the reading test, the participants were asked to take the MWE test, which presents the very same target MWEs in a discrete, selected-response format with minimal, nondefining contextualization. Administering this measure alongside the reading tasks allowed for comparisons between learners’ understanding of the MWEs in isolation vs. their understanding of the relevant MWEs in a reading context.

Analyses and results

The verbal protocols of the reading tasks were transcribed (and translated if not in English), and then coded by the first author using NVivo 10. Of the transcripts, 27 percent (four randomly chosen protocols on both tasks) were double-coded by the second author, resulting in an inter-rater agreement of 95 percent. Coding nodes were established in an exploratory fashion as they emerged from the verbal protocols (Dörnyei 2007). Participants’ processing of the MWEs in the text was coded as:

  • read aloud or reread.

  • mentioned, that is, with no further elaboration.

  • elaborated on, that is, explicitly related to a paraphrase or synonym of the MWE.

  • paraphrased, but without explicit reference to or reproduction of the target MWE.

  • implied, that is, no explicit reference of the target MWE, but implicitly showing use of its meaning in answering the item.

  • ignored, that is, although the MWE may have been read aloud, its relevance to answering the item is overlooked.

Overall results

Since the main aim of this qualitative part of our research was to gain initial insights into how learners process MWEs whilst reading for comprehension, we will now show examples of each of the types of processing evidenced in our data set, that is, whether the learners read aloud (Process A), mentioned (Process B), elaborated on (Process C), paraphrased (Process D), implied (Process E), or ignored (Process F) the MWE.

At the same time, we found that certain processes seemed to be associated with certain outcomes. It could be argued that the demonstration of knowledge or understanding of an MWE in isolation in the discrete measure (the phraseological knowledge test) is most likely to result in correct answers to the reading item targeting that particular MWE in context. A comparison of performances on the MWE test and the reading test (see Table 4) shows that the scores on the two measures indeed ‘match’ in the majority of the cases (70 percent). In most cases, learners answered both the discrete item and the corresponding reading item correctly [cell (a) in Table 4]. However, in 30 percent of cases, there was a mismatch between getting the discrete item correct and answering the reading item correctly [cells (b) and (c) in Table 4]. Therefore, we will illustrate three of these outcomes [cells (a), (b), and (c)] with MWE processing examples from our verbal protocol data. We would like to emphasize, however, that we are not claiming a cause–effect relationship on the basis of our study; rather, these are impressions of tendencies.

  • MWE correct and reading item correct

 In most cases, candidates’ answers to both the reading item and the relevant MWE item were correct. Participant 6, for instance, made explicit use of her knowledge of the targeted MWEs. The True/False/Justification statement of Item 5 of the reading task ‘The Artist’, for example, reads:

Item

Roy only copied paintings by two famous artists.

The answer to the item can be found in the following passage of the text:

Text

This, of course, gives rise to much shaking of heads among art experts. After all, the whole point of art is to be creative and original. Even so, Roy was proud of his ability to reproduce Picassos, Van Goghs and so on so accurately they seemed like the real thing.

Using her knowledge of the relevant MWE (‘so on’) explicitly, Participant 6 read aloud (Process A) and elaborated on the MWE, explaining her reasoning in order to successfully arrive at the correct answer (Process C):

Think-aloud

‘Roy only copied paintings by two famous artists.

OK, in the text it says,

“Even so, Roy was proud of his ability to reproduce Picassos, Van Goghs and so on”

So that’s false, because “so on” means something more and that means that he did not only copy paintings by two famous artists.’

This instance seems to provide evidence that these MWEs are treated as meaning units or phraseological chunks as Participant 6 states that ‘so on’ means ‘something more’. In another instance, the same learner in the same task demonstrated how knowing the discrete MWE can aid in answering the reading item correctly. In Item 9 of the task ‘The Artist’, the True/False/Justification statement that had to be judged reads:

Item

Roy produced modern paintings because he could thereby develop his artistic techniques.

This item targeted the understanding of the MWE ‘whether or not’ in the following paragraph of the reading text:

Text

Personally, he didn’t much care for contemporary art, but in light of the popularity of minimalist and abstract subjects, especially with people who lived in modern houses, he found himself doing a lot of canvases either with a single black dot in the corner or else covered in squiggles. But at least the modern works were quick and easy to do. Whether or not it allowed him to improve his brushwork was beside the point – modern art was lucrative. As a result Roy was soon earning very well indeed.

This verbal protocol of Participant 6 shows a different way of making use of discrete MWE knowledge. In this instance, she first read (Process A), mentioned (Process B), and then paraphrased the target phrase (Process D):

Think-aloud

‘“Roy produced modern paintings because he could thereby develop his artistic techniques.”

OK, and in the text it says, “whether or not it allowed him to improve his brushwork was beside the point.”

OK, I think the sentence is a little bit confusing, whether or not means something, maybe means something like, well, I don’t know a German translation, yeah, I know what it means but I can’t explain it. “Whether or not it allowed him to improve his brushwork was beside the point – modern art was lucrative.” Oh, Whether or not it allowed him to improve his brushwork was beside the point. So, whether or not it allowed him to improve his brushwork, that wasn’t important, it was lucrative. So I think question number nine is false, because it was not important, if his brushwork was improved or not, it was just lucrative.’

Table 4: Study 2 cross-tabulation of MWE knowledge in the phraseological knowledge test and in the reading item targeting the MWE

                      Reading item correct    Reading item incorrect
MWE item correct      66 percent (a)          17 percent (b)
MWE item incorrect    13 percent (c)          4 percent (d)
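
The four cells of Table 4 can be combined into overall agreement and mismatch rates. The following minimal Python sketch only sums the reported percentages; the variable names are illustrative assumptions and not part of the study. Cells (a) and (d) are cases in which the discrete MWE item and the reading item had the same outcome, while cells (b) and (c) are the mismatching cases.

```python
# Percentages reported in Table 4 (Study 2); cell labels follow the table.
# The dictionary name and keys are illustrative, not taken from the study.
table4 = {
    "a": 66,  # MWE item correct, reading item correct
    "b": 17,  # MWE item correct, reading item incorrect
    "c": 13,  # MWE item incorrect, reading item correct
    "d": 4,   # MWE item incorrect, reading item incorrect
}

# The four cells should cover all observed cases.
total = sum(table4.values())

# Same outcome on both measures: cells (a) and (d).
agreement = table4["a"] + table4["d"]

# Different outcomes on the two measures: cells (b) and (c).
mismatch = table4["b"] + table4["c"]

print(f"total: {total} percent")
print(f"agreement: {agreement} percent")
print(f"mismatch: {mismatch} percent")
```

On the reported figures, the two measures agree in 70 percent of cases, and the remaining 30 percent are mismatches of types (b) and (c).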

Other participants seemed to make implicit use of their MWE knowledge. In the protocols, this can be seen when they drop the relevant phrase in answering the items. Participant 11, for instance, when answering Item 2 of the task ‘Great Brit Boys Bake’, did not explicitly refer to the phrase ‘have to’ while showing understanding of it in establishing the answer (Process E). The relevant passage for the item reads:

Text

It would appear that originality is not necessarily the key to captivating a TV audience. Great Brit Boys Bake, a cooking competition hosted by a hip young celebrity chef, is by no means a novel concept. But GBBB is just a bit different: it’s for boys only – entrants have to be male and maximum 24 years old – and it’s boys going sweet on us, creating exquisite confections of all kinds.

The corresponding True/False/Justification statement (In order to participate, a candidate must meet two requirements) was answered by Participant 11 as follows:

Think-aloud

‘“In order to participate, a candidate must meet two requirements.”

Yeah, that’s true, because, it would appear, no, Great Brit Boys Bake, no, there it is

But GBBB, erm, well, entrants maximum 24 years old and male… erm, well, I think I can count that as one sentence.’

  • (b) MWE correct but reading item incorrect

In some cases, students answered the reading item incorrectly but managed to correctly complete the discrete MWE test item. Closer analysis of the verbal protocols of these mismatching cases showed that most learners seemed to simply ignore the relevant MWE. An example of this was Item 9 in the task ‘The Artist’, which 8 out of 15 participants answered incorrectly. They appeared to overlook the relevance of the MWE (Process F), even though they got the MWE correct in the discrete measure. Participant 3’s verbal protocol illustrates this:

Think-aloud

‘“Roy produced modern paintings because he could thereby develop his artistic techniques.”

“Whether or not it allowed him to improve his brushwork was beside the point – modern art was lucrative. As a result Roy was soon earning very well indeed.”

Roy produced modern paintings

I’m going to have to read that again.

“Because he could thereby develop his artistic techniques.”

Erm, I don’t think that he… the techniques, well the first sentence only says that it is quick and easy to do, but then it also says… it allowed him to improve his brushwork, but that was only a minor point. “Modern art was lucrative.”

I think, that’s false, because it doesn’t really tell me that it did improve his, that’s

No, thereby, could thereby develop, then it’s true after all, because he can at the same time improve his brushwork. So, this is true.’

After having read the entire text to start with (and having answered preceding items), Participant 3 starts by reading the item before returning to the text. Her struggle with this passage is shown by the fact that she has to reread it (Process A) before attempting an answer. She then goes on to paraphrase what she has read, stating that improving the brushwork ‘was only a minor point’ for the artist. However, she then reconsiders, ignoring the MWE that would help her arrive at the correct answer (Process F). Similar processes were found for the other participants answering this reading item incorrectly. The problem thus does not seem to lie in a misunderstanding of the phrase, but in a lack of attention paid to the meaning it carries that might be relevant to the answer. The position of this MWE at the beginning of the relevant sentence might have further contributed to it being overlooked. Participants might have mistaken the phrase for a semantically relatively irrelevant wh-question and therefore not paid attention to the chunk (see Note 3).

  • (c) MWE incorrect but reading item correct

In a few cases, the participants answered the reading item correctly, but answered the corresponding discrete MWE test item incorrectly. In several of these instances, the learners paraphrased the relevant contextualized MWE (Process D). This may indicate that, at least for some learners, contextualization aids understanding of the MWE which was not understood in isolation. An example illustrating this phenomenon is seen in Item 8 in the task ‘Great Brit Boys Bake’. A third of the participants answered this reading item correctly. The co-text appears to have helped them understand the relevant MWE, even though they did not show understanding of the MWE in the phraseological knowledge test. The item reads:

Item

The author assumed that his son would attend the baking course offered at school.

It targets the understanding of the MWE ‘take it for granted’ in the following paragraph of the reading text:

Text

When I by chance found out that, as of next term, an enterprising teacher at my son’s school was offering an optional class in baking, I took it for granted that Josh, my macho 17-year-old, would be choosing circuit training instead. The idea of him ever being interested in anything to do with baking apart from gorging himself on his mother’s, the idea of him ever weighing and measuring anything other than his own muscular physique – well, the chances were simply too remote.

Participant 5’s verbal protocol shows that she had no difficulty understanding the phrase in context:

Think-aloud

‘Ok, then Q8 is “The author assumed that his son would attend the baking course offered at school.”

His son will attend a baking course, if there was one in the school.

“When I by chance found out that, as of next term, an enterprising teacher at my son’s school was offering an optional class in baking, I took it for granted that Josh, my macho 17-year-old, would be choosing circuit training instead.”

So, he thinks that his son would rather do cycle training than cooking and that’s why this doesn’t match up with the statement and the justification is

“When I by chance…”’

Participant 5 appeared to read the MWE with no comprehension problems whatsoever, despite not having demonstrated knowledge of the relevant phrase in the phraseological knowledge test. This happened surprisingly frequently with the MWE in this specific reading item (Item 8), which could indicate that background knowledge or other elements from the co-text helped in understanding and answering this item. In fact, such mismatches were not evenly distributed across the reading items: almost half of all mismatches are accounted for by only 4 of the 18 reading items. While this might be taken as an indication of the quality of these items, it could also be that the contextualization of MWEs in these four items particularly facilitates or hinders comprehension. In general, however, the verbal protocols of the mismatches suggest that students who know the meaning of an MWE in isolation may still struggle to understand it, or may simply ignore it, when they encounter the MWE in context. In other cases, the co-text seems to help learners understand MWEs that they might not understand in isolation.

DISCUSSION

Our first research question was: ‘Does a broader construct definition of syntactic and vocabulary knowledge including phraseological knowledge provide useful information in the prediction of EFL reading performance?’ Study 1 showed that including a measure of MWE knowledge increased the amount of explained variance in the reading test scores considerably (from 69 percent to 75 percent). In addition, the phraseological knowledge measure outperformed the traditional vocabulary and syntax measures as predictors of reading performance. This result was not due to an overlap of phrases targeted in the MWE test and the texts of the reading measure: only 13 targets from the MWE test also occurred in the total 2,378 words of text of the reading measure of Study 1, so the overlap was negligible.
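
The size of this gain can be made explicit with a little arithmetic. The sketch below is illustrative only: the variable names are assumptions, and expressing the gain as a share of the previously unexplained variance is a reading added here for orientation, not a statistic reported in Study 1.

```python
# Explained variance in reading test scores reported for Study 1.
# Variable names are illustrative, not taken from the study.
r2_without_mwe = 0.69  # model without the phraseological knowledge measure
r2_with_mwe = 0.75     # model including the phraseological knowledge measure

# Absolute gain in explained variance.
delta_r2 = r2_with_mwe - r2_without_mwe

# Share of the variance left unexplained by the first model
# that is accounted for once the MWE measure is included.
share_of_residual = delta_r2 / (1 - r2_without_mwe)

print(f"gain in explained variance: {delta_r2:.2f}")
print(f"share of previously unexplained variance: {share_of_residual:.2f}")
```

Under these assumptions, the increment of 6 percentage points corresponds to roughly a fifth of the variance that the model without the phraseological measure left unexplained.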

The strong interrelations between the vocabulary knowledge measure and the phraseological measure (r = .90*) suggest that MWEs are primarily lexical in nature rather than syntactic, thus tending toward the lexical end of a lexicogrammar continuum (Sinclair 2004). It could therefore mean that these expressions operate somewhat similarly to individual words, in that they are part of vocabulary knowledge as discrete meaning units, at least for these quite proficient EFL learners. The think-aloud data of Study 2 also provided some evidence that if participants elaborated on the MWEs, these MWEs were being parsed as chunks rather than as individual words (see, e.g. Participant 6 above). Indeed, in the verbal protocol data, none of the participants attempted to decode an MWE by analyzing and combining the literal meanings of its component parts.

The strong relationship between the vocabulary measure and the phraseology measure as well as between the vocabulary measure and the reading measure could, to a certain extent, be due to the nature and design of the vocabulary measure. It tested vocabulary items in embedded contexts and also tested more than merely form-meaning link knowledge (e.g. collocations). Thus, the operationalization of the construct in this test seems closer to a reading measure or a phraseology test, respectively, than a discrete vocabulary measure purely focusing on the form-meaning link of individual words.

In sum, however, the fact that the inclusion of the phraseological knowledge measure not only increased the overall variance explained, but also emerged as the strongest contributor, suggests two conclusions. First, the incorporation of phraseological knowledge into the component ‘vocabulary and structural knowledge’ as conceptualized in reading research (Grabe 1991) is necessary to ensure a full representation of the construct of linguistic knowledge. Second, the findings suggest that understanding of MWEs might be more relevant to fluent reading than estimated so far. While we still question whether the components vocabulary knowledge and syntactic knowledge are indeed ‘theoretically distinct and empirically isolable constituents’ (Hoover and Tunmer 1993: 4), particularly in light of the high intercorrelations between component parts in this and other studies (e.g. Brunfaut 2008; Shiotsu 2010), the findings certainly substantiate the claim that conceptualizations and studies that do not take phraseological knowledge into account fail to provide a comprehensive picture of the significance of linguistic knowledge for reading.

Following from our finding that phraseological knowledge contributes importantly to EFL reading test comprehension, we sought to understand how exactly advanced EFL readers make use of these expressions in reading (RQ2). The examination of 15 verbal protocols in Study 2 showed that, overall, knowing the MWE in isolation coincides with being able to answer the reading comprehension item correctly. Although the influence of other factors involved in reading cannot be ruled out, participants appeared to be able to make use of the MWE in a reading context for comprehension purposes, if they knew the MWE. Moreover, participants used their MWE knowledge in different ways: paraphrasing, explicitly elaborating on, implicitly using, or even ignoring phrasal expressions in context. Unsurprisingly, the more attention learners paid to the MWEs, the more likely it was that they arrived at the correct answer. Attempts to elaborate and/or paraphrase almost always coincided with successful answering of the reading item. Evidence in the think-alouds of ignoring the MWE, or of failing to recognize its importance to the item answer, was likely to co-occur with an unsuccessful attempt to answer the reading item.

Some data deviated from these findings, however, in that the contextualization of MWEs seemed to hinder comprehension in a minority of the cases. In almost as many cases, though, the reading context appeared to facilitate understanding of the MWE, as indicated by seemingly effortless paraphrases of the MWEs in many participants’ verbal protocols. The main reason for participants’ failure to answer reading comprehension items correctly, despite demonstrating an understanding of the relevant MWE in isolation, was that they simply seemed to ignore the MWEs in such instances. This echoes Martinez and Murphy’s (2011) finding that MWEs frequently go unnoticed by EFL readers. However, although our use of verbal protocols showed instances of overlooking or ignoring MWEs in reading, additional insights into how learners deal with MWEs in reading could potentially be gained from eye-tracking or other psycholinguistic methodologies. Furthermore, additional research with a larger sample size is recommended to assess the generalizability of Study 2’s initial, qualitative exploration of MWE processing in EFL reading.

The results of our research might be interpreted as indicating that phraseological knowledge is particularly relevant for advanced EFL readers, who were the sole focus of this study. Some information on how these relatively proficient readers deal with MWEs in reading has been obtained, but this needs to be probed further to be better understood. In addition, the role of knowledge of MWEs in reading at lower proficiency levels still needs to be investigated. The current study could, however, be a first step in acknowledging the importance of this type of knowledge for successful reading, and could potentially also hint at it being similarly relevant for other skill areas (see, e.g. Brunfaut and Révész 2015 on the relationship between phrasal expressions and listening task difficulty). In any case, the finding of the importance of phraseological knowledge for reading comprehension reinforces Martinez and Murphy’s (2011: 274) claim that ‘multi-word expressions just may present a larger problem for reading comprehension than accounted for in the current literature’. It therefore seems justified to support Martinez and Schmitt’s (2012: 316) claim that there is a ‘need for a principled way to more systematically include formulaic sequences in L2 pedagogy’. It appears this claim pertains not only to EFL pedagogy, but also to EFL language testing. The present findings suggest that a greater awareness of the formulaic nature of language is needed among both teachers and test developers. Although it is beyond the scope of this exploratory study to suggest specific strategies for successful reading of MWEs or for implementation of these strategies into FL teaching, the study does raise the issue that MWEs need to be considered when screening and selecting materials for classroom instruction and/or item design.

Raising awareness of this is even more important when considering that freely available text analysis tools, such as www.lextutor.ca (Cobb n.d.), which are frequently used by teachers and item writers for text selection, do not currently take phraseology into account. What is more, in terms of FL teaching, it seems reasonable to imagine that explicit teaching of MWEs might facilitate reading success in the same way that increasing a person’s vocabulary size has been established to help reading comprehension and reading test performance. However, further research would need to corroborate this hypothesis.

In addition, if phraseological knowledge is as important as our results seem to suggest, we clearly need more, and particularly more refined, tools and instruments to measure this. Phraseological knowledge needs to become incorporated into measurement tools of linguistic knowledge and into studies that investigate the relationship of linguistic knowledge and reading ability. This is particularly relevant in light of the increasing interest in diagnostic language testing (Alderson 2005; Jang 2009; Lee and Sawaki 2009; Alderson et al. 2015), for which instruments need to be developed that ‘target specific, atomistic aspects of language knowledge’ (Alderson et al. 2015: 22). As linguistic knowledge is crucial to reading ability, diagnostic tests of this component may enable tailoring language support and linguistic instruction for reading pedagogy more to the needs of the students in the classroom. We might thus have to recognize that some aspects of language knowledge are perhaps not as atomistic or discrete as ‘desirable’ for this purpose. In other words, we may wish to consider developing tests of lexicogrammar rather than ‘pure’ syntax or vocabulary tests, or integrating aspects of syntactic or phraseological properties of vocabulary into vocabulary tests. Even if opting for a ‘distinct-components’ approach, it is important to cast our nets wider and incorporate measurements of phraseological knowledge alongside traditional measures of vocabulary and syntax in the development of diagnostic test batteries. Further research is thus needed for a better understanding of both FL acquisition and the diagnosis of strengths and weaknesses in FL reading ability.

CONCLUSION

Two studies attempted to explore the role of phraseological knowledge in FL reading comprehension. Study 1 demonstrated quantitatively that knowledge of MWEs is a key component of linguistic knowledge and a better predictor of EFL reading variance than traditional vocabulary and syntax tests. This has theoretical and practical implications for the conceptualization and operationalization of the construct ‘linguistic knowledge’ in relation to FL reading. Study 2 provided initial insights into how EFL learners process and make use of such MWEs in reading by means of a qualitative analysis. The findings from participants’ verbal protocols complemented the methodology of the first study and demonstrated how participants elaborated and paraphrased MWEs during the process of understanding texts. The findings also suggested, however, that a considerable number of readers ignore crucial MWEs in context, which may lead to incorrect answers to reading comprehension items. Raising awareness of this problem and a systematic focus on MWEs in language teaching are therefore suggested to help EFL learners.

These two studies, however, are only a first pointer to the important role that phraseological knowledge might play in FL reading. Further research could enhance models and understandings of FL reading comprehension for both the teaching and the testing of this skill. A clearer understanding of the relevance of this component for FL reading ability seems necessary, not only for a more sophisticated understanding of reading ability and thus clearer assessment constructs, but also for enhanced diagnostic techniques, as they could help to make reading pedagogy more tailored and efficient.

NOTES

1 A closer look at the data also showed that there was some variability between this participant’s scores on various instruments, with particularly low scores on the reading test. Since the participant had all answers to the last 10 reading items systematically wrong, this may be an indication that the participant did not put in effort or take the test seriously and just randomly picked answers to the selected-response items.
2 In addition, due to the inherently formulaic nature of language, it cannot be excluded that phraseological knowledge played some part in the constructs assessed by the syntactic and vocabulary knowledge measures.
3 A potential additional factor might be the difficulty of the item, in that a link needs to be made between ‘artistic techniques’ (item) and ‘brushwork’ (text). Possibly, some participants primarily focused on establishing this link, thereby overlooking other information in the text.

ACKNOWLEDGMENTS

We would like to thank Professor Norbert Schmitt (The University of Nottingham, UK) and the anonymous reviewers for their valuable feedback on our manuscript. This work was in part supported by the UK Economic and Social Research Council (grant number ES/H022872/1), and the Leverhulme Trust Emeritus Fellowship scheme (grant number EM-2012-030/1).

Conflict of interest statement. None declared.

SUPPLEMENTARY DATA

Supplementary material is available at Applied Linguistics online.

REFERENCES

Alderson, J. C. 1984. ‘Reading in a foreign language: A reading problem or a language problem’ in Alderson, J. C., Urquhart, A. H. (eds): Reading in a Foreign Language. Longman, pp. 1–27.
Alderson, J. C. 2000. Assessing Reading. Cambridge University Press.
Alderson, J. C. 2005. Diagnosing Foreign Language Proficiency: The Interface between Learning and Assessment. Continuum.
Alderson, J. C., Kremmel, B. 2013. ‘Re-examining the content validation of a grammar test: The (im)possibility of distinguishing vocabulary and structural knowledge,’ Language Testing 30/4: 535–56.
Alderson, J. C., Brunfaut, T., Harding, L. 2015. ‘Towards a theory of diagnosis in second and foreign language assessment: Insights from professional practice across diverse fields,’ Applied Linguistics 36/2: 236–60.
Biber, D., Johansson, S., Leech, G., Conrad, S., Finegan, F. 1999. Longman Grammar of Spoken and Written English. Longman.
BMUKK [Bundesministerium für Unterricht, Kunst und Kultur]. 2004. Oberstufenlehrplan für die Erste und Zweite Lebende Fremdsprache für Allgemein Bildende Höhere Schulen.
Bossers, B. 1992. Reading in Two Languages: A Study of Reading Comprehension in Dutch as a Second Language and in Turkish as a First Language. Dukkerij Van Driel.
Bowles, M. E. 2010. The Think-aloud Controversy in Second Language Research. Routledge.
Brisbois, J. E. 1995. ‘Connections between first- and second-language reading,’ Journal of Reading Behavior 27/4: 565–84.
Brunfaut, T. 2008. ‘Foreign language reading for academic purposes. Students of English (native speakers of Dutch) reading English academic texts,’ PhD thesis, University of Antwerp.
Brunfaut, T., Révész, A. 2015. ‘The role of task and listener characteristics in second language listening,’ TESOL Quarterly 49/1: 141–68.
Cobb, T. (n.d.). Compleat lexical tutor v.8. Retrieved from www.lextutor.ca.
Conklin, K., Schmitt, N. 2008. ‘Formulaic sequences: Are they processed more quickly than nonformulaic language by native and non-native speakers?,’ Applied Linguistics 29/1: 72–89.
Conklin, K., Schmitt, N. 2012. ‘The processing of formulaic language,’ Annual Review of Applied Linguistics 32: 45–61.
Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge University Press.
Dörnyei, Z. 2007. Research Methods in Applied Linguistics. Oxford University Press.
Droop, M., Verhoeven, L. 2003. ‘Language proficiency and reading ability in first- and second language learners,’ Reading Research Quarterly 38/1: 78–103.
Erman, B., Warren, B. 2000. ‘The idiom principle and the open choice principle,’ Text 20/1: 29–62.
Grabe, W. 1991. ‘Current developments in second-language reading research,’ TESOL Quarterly 25/3: 375–406.
Grabe, W. 2009. Reading in a Second Language: Moving from Theory to Practice. Cambridge University Press.
Grabe, W., Stoller, F. L. 2011. Teaching and Researching Reading. Longman/Pearson.
Green, A. 1998. Verbal Protocol Analysis in Language Testing Research. Cambridge University Press.
Green, R. 2000. ‘An empirical investigation of the componentiality of EAP reading and EAP listening through language test data,’ PhD thesis, The University of Reading.
Gulliksen, H. 1950. Theory of Mental Tests. John Wiley & Sons, Ltd.
Guo, Y., Roehrig, A. D. 2011. ‘Roles of general versus second language (L2) knowledge in L2 reading comprehension,’ Reading in a Foreign Language 23/1: 42–64.
Hacquebord, H. 1989. Tekstbegrip van Turkse en Nederlandse Leerlingen in het Voortgezet Onderwijs. De Gruyter Mouton.
Hoover, W. A., Tunmer, W. E. 1993. ‘The components of reading’ in Thompson, G. B., Tunmer, W. E., Nicholson, T. (eds): Reading Acquisition Processes. Multilingual Matters, pp. 1–19.
Jang, E. E. 2009. ‘Demystifying a Q-matrix for making diagnostic inferences about SFL reading skills,’ Language Assessment Quarterly 6/3: 210–38.
Jeon, E. H., Yamashita, J. 2014. ‘L2 reading comprehension and its correlates: A meta-analysis,’ Language Learning 64/1: 160–212.
Kline, R. B. 2011. Principles and Practice of Structural Equation Modeling, 3rd edn. The Guilford Press.
Koda, K. 2005. Insights into Second Language Reading: A Cross-Linguistic Approach. Cambridge University Press.
Lee, Y.-W., Sawaki, Y. 2009. ‘Cognitive diagnosis approaches to language assessment. An overview,’ Language Assessment Quarterly 6/3: 172–89.
Martinez, R. 2011. ‘Putting a test of multiword expressions to a test’. Paper presented at the IATEFL Testing, Evaluation and Assessment Conference (TEA SIG), University of Innsbruck.
Martinez, R. 2013. ‘How does (a lack of) knowledge of multiword expressions affect reading comprehension?’ Paper presented at the AAAL Conference, Dallas, TX.
Martinez, R., Murphy, V. A. 2011. ‘Effect of frequency and idiomaticity on second language reading comprehension,’ TESOL Quarterly 45/2: 267–90.
Martinez, R., Schmitt, N. 2012. ‘A Phrasal Expressions List,’ Applied Linguistics 33/3: 299–320.
Nassaji, H. 2003. ‘Higher-level and lower-level text processing skills in advanced ESL reading comprehension,’ Modern Language Journal 87/2: 261–76.
Nassaji, H., Geva, E. 1999. ‘The contribution of phonological and orthographic processing skills to adult ESL reading: Evidence from native speakers of Farsi,’ Applied Psycholinguistics 20/2: 241–67.
Nattinger, J., DeCarrico, J. 1992. Lexical Phrases and Language Teaching. Oxford University Press.
Nergis, A. 2013. ‘Exploring the factors that affect reading comprehension of EAP learners,’ Journal of English for Academic Purposes 12/1: 1–9.
Oppenheim, N. 2000. ‘The importance of recurrent sequences for non-native speaker fluency and cognition’ in Riggenbach, H. (ed.): Perspectives on fluency. University of Michigan Press, pp. 220–40.
Raykov, T., Marcoulides, G. A. 2000. A First Course in Structural Equation Modeling. Lawrence Erlbaum.
Römer, U. 2009. ‘The inseparability of lexis and grammar: Corpus linguistic perspectives,’ Annual Review of Cognitive Linguistics 7: 140–62.
Sampson, G. 1975. The Form of Language. Weidenfeld and Nicolson.
Schoonen, R., Hulstijn, J., Bossers, B. 1998. ‘Metacognitive and language-specific knowledge in native and foreign language reading comprehension: An empirical study among Dutch students in grades 6, 8 and 10,’ Language Learning 48/1: 71–106.
Schumacker, R. E., Lomax, R. G. 1996. A Beginner’s Guide to Structural Equation Modeling. Lawrence Erlbaum.
Shiotsu, T. 2010. Components of L2 Reading: Linguistic and Processing Factors in the Reading Test Performances of Japanese EFL Learners. Cambridge University Press.
Shiotsu, T., Weir, C. J. 2007. ‘The relative significance of syntactic knowledge and vocabulary breadth in the prediction of reading comprehension test performance,’ Language Testing 24/1: 99–128.
Sinclair, J. 2004. Trust the Text. Language, Corpus and Discourse. Routledge.
Siyanova-Chanturia, A., Martinez, R. 2014. ‘The Idiom principle revisited,’ Applied Linguistics, 1–22. doi:10.1093/applin/amt054
Siyanova-Chanturia, A., Conklin, K., Schmitt, N. 2011a. ‘Adding more fuel to the fire: An eye-tracking study of idiom processing by native and nonnative speakers,’ Second Language Research 27/2: 251–72.
Siyanova-Chanturia, A., Conklin, K., Heuven, W. Van. 2011b. ‘Seeing a phrase “time and again” matters: The role of phrasal frequency in the processing of multi-word sequences,’ Journal of Experimental Psychology: Language, Memory, and Cognition 37/3: 776–84.
SRP [Standardisierte Reifeprüfung]. 2009. B2 Reading Test Specifications.
Sternberg, R. J. 1987. ‘Most vocabulary is learned from context’ in McKeown, M. G., Curtis, M. E. (eds): The Nature of Vocabulary Acquisition. Erlbaum, pp. 89–103.
Tremblay, A., Derwing, B., Libben, G., Westbury, C. 2011. ‘Processing advantages of lexical bundles: Evidence from self-paced reading and sentence recall tasks,’ Language Learning 61/2: 569–613.
Urquhart, A. H., Weir, C. J. 1998. Reading in a Second Language: Process, Product, and Practice. Longman.
Vajjala, S., Meurers, D. 2012. ‘On improving the accuracy of readability classification using insights from second language acquisition,’ Proceedings of the Seventh Workshop of Building Educational Applications Using NLP. Association for Computational Linguistics, pp. 163–173.
Van Gelderen, A., Schoonen, R., de Glopper, K., Hulstijn, J., Simis, A., Snellings, P., Stevenson, M. 2004. ‘Linguistic knowledge, processing speed and metacognitive knowledge in first and second language reading comprehension: a componential analysis,’ Journal of Educational Psychology 96/1: 19–30.
Van Gelderen, A., Schoonen, R., Glopper, K. de, Hulstijn, J., Snellings, P., Simis, A., Stevenson, M. 2003. ‘Roles of linguistic knowledge, metacognitive knowledge and processing speed in L3, L2 and L1 reading comprehension: a structural equation modeling approach,’ International Journal of Bilingualism 7/1: 7–25.
Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge University Press.
Wray, A. 2008. Formulaic Language: Pushing the Boundaries. Oxford University Press.
Yamashita, J. 1999. ‘Reading in a first and a foreign language: A study of reading comprehension in Japanese (the L1) and English (the L2),’ PhD thesis, Lancaster University.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
