Abstract

Most common words in English have multiple meanings, but relatively little is known about why children grasp some meanings better than others. This study examined how variables at the child level, wordform level, and meaning level affect knowledge of words with multiple meanings. A total of 174 children aged 5 to 9 years completed a test of homonym knowledge, and measures of non-verbal intelligence and language background were collected. Psycholinguistic features of the wordforms tested were assessed through adult ratings, corpus coding, and existing databases. Logistic mixed effects models revealed that whilst the frequency of wordforms contributed to children’s knowledge, so too did the dominance and imageability of the separate meanings of each word. Predictors were similar for children with English as an Additional Language and English as a first language. This greater understanding of why some word meanings are known better than others has significant implications for vocabulary learning.

Vocabulary underpins children’s and adults’ educational attainment (Bleses et al. 2016; Schuth et al. 2017; Masrai and Milton 2018) and thus understanding the factors that affect vocabulary learning is crucial. Children’s acquisition of new vocabulary is affected by both child-level factors (e.g. first or second language acquisition (Farnia and Geva 2011)) and wordform-level factors (e.g. frequency (Elleman et al. 2017)). Research has begun to examine how these factors uniquely contribute to children’s word recognition (e.g. De Wilde et al. 2019), but this has not often extended to homonyms. Homonyms are words with multiple meanings (e.g. deck can mean a ship part or pack of cards). Most wordforms that children will learn have multiple meanings (Rodd et al. 2002; Armstrong 2012), and furthermore, for homonyms, psycholinguistic features (e.g. imageability) also vary at the meaning level (e.g. cycle as in ride a bicycle is more imageable than cycle as in recurring process). The present study addresses this gap by examining factors affecting children’s recognition of the meanings of semantically ambiguous words, including those at the child level (individual differences), and the wordform level and meaning level (psycholinguistic features).

PREDICTORS OF VOCABULARY LEARNING

A range of psycholinguistic features of words has been identified that may contribute to children’s vocabulary learning. One such feature is frequency. Frequency refers to the regularity with which a word is encountered. This is typically measured by calculating the frequency with which a wordform appears in a corpus of oral or written language relevant to the population (e.g. children’s television program subtitles (van Heuven et al. 2014)). Children are more likely to know definitions of words that are more frequent in school or general texts compared to words that are less frequent (Graves et al. 1980; Elleman et al. 2017) and adults are better at defining more frequently presented pseudowords (Elgort and Warren 2014). Infants also learn verbs and nouns that are higher frequency in parental speech earlier than lower frequency verbs and nouns (Goodman et al. 2008), although this relationship is reversed when comparing all parts of speech because verbs are more frequent but learnt later. These findings suggest that frequency typically has a positive impact on word knowledge, and also highlight that different psycholinguistic features interact in predicting word learning.

The part of speech that a word occupies also impacts the ease with which the word is learnt. In general, nouns tend to dominate in infants’ early lexicons, especially relative to their frequency in early language input (Goodman et al. 2008; McDonough et al. 2011; Waxman et al. 2013). Likewise, older children and adults find it easier to infer the meanings of nouns than verbs (Piccin and Waxman 2007) and adult second language learners find it easier to learn the meanings of nouns than verbs (Ellis and Beaton 1993). This benefit for nouns is often ascribed to nouns being more highly imageable than verbs (Piccin and Waxman 2007; McDonough et al. 2011).

Indeed, imageability is another important determinant of word learning. Imageability is the ease with which a word stimulates a mental image. This is closely related to concreteness, which is the extent to which a word’s referent construct is concrete (e.g. chair) versus abstract (e.g. love). Imageability is measured by obtaining subjective ratings from participants of how easily they can generate images in their minds for words (e.g. Stadthagen-Gonzalez and Davis 2006). More imageable (or concrete) words tend to be learnt more easily than less imageable (or more abstract) words by adults (e.g. Ellis and Beaton 1993; Palmer et al. 2013; Elgort and Warren 2014) and children (McFalls et al. 1996). However, one study found that in a vocabulary intervention, imageability of words had no effect on word learning after controlling for frequency (Elleman et al. 2017). Thus, imageability like frequency can positively contribute to children’s word knowledge, although when controlling for other word features the impact of imageability may be attenuated.

Semantic neighbourhood density relates to the number of words with similar semantic attributes in the same language (Durda and Buchanan 2008). A word with high semantic density will likely have several synonyms, antonyms, hypernyms, hyponyms, or co-hyponyms. For example, big has high semantic density because there are several other words that can be used in a very similar context (e.g. large, giant, small). Semantic density is calculated by using a corpus of language to assess which contexts words appear in (i.e. which other words co-occur within a set range): then, the target word’s closest neighbours are identified by quantifying the similarity of their contexts, and the average similarity is calculated to give a measure of density. Existing data provide mixed results for the effect of semantic density on word processing. Studies with adults have found positive effects of higher semantic density in lexical decision tasks (Buchanan et al. 2001; Hoffman and Woollams 2015; Danguecan and Buchanan 2016) and semantic relatedness judgement tasks for abstract words only (Danguecan and Buchanan 2016). In support of this, infants’ productive vocabulary, as measured by the MacArthur-Bates Communicative Development Inventory, contains more nouns with higher semantic density (Storkel 2009). However, one study with adults found a negative effect of semantic density on semantic relatedness judgement (Hoffman and Woollams 2015). Likewise, preschool children seem to be less willing to learn a new label for a concept for which they already know two synonyms (i.e. a concept with higher semantic density for the child; Nicoladis and Laurent 2020). Thus, whilst semantic neighbourhood density has mixed effects on word processing and learning, the latter finding implies that we might expect it to have a negative impact on older children’s word knowledge if higher semantic density signals more known words which are synonyms.

Phonological neighbourhood density may also impact word learning. Phonological neighbourhood density is a measure of the quantity of other words which differ from the target word in only one phoneme. For example, nail has many phonological neighbours (e.g. hail, mail, snail, name, null) whereas iron has few (e.g. lion). This is often calculated by weighting neighbours for their frequency, so that less frequently occurring neighbours have a lesser effect on the phonological density. In infants’ productive vocabulary there are more words from dense neighbourhoods (Storkel 2009). However, data with children suggests that words from sparse phonological neighbourhoods may be more easily recalled or processed depending on the task (Garlock et al. 2001; Storkel 2009; Hoover et al. 2010; James et al. 2021). For example, when pseudowords that varied in phonological density were presented to 8- to 10-year-olds in stories, their recognition of the word forms from denser neighbourhoods was poorer (James et al. 2021), though their recall and meaning recognition were not affected. Thus, in tasks containing phonological distractors, it might be expected that high phonological density could be a disadvantage, because it would increase the likelihood of selecting the phonological distractor. Therefore, the impact of phonological neighbourhood density on children’s word knowledge depends on the task demands, but where recognition is required, and phonological distractors are used, it may be more likely to show a disadvantage.
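To make the definition of a phonological neighbour concrete, the following minimal sketch counts words that differ from a target by exactly one phoneme (one substitution, addition, or deletion). The broad transcriptions in `LEXICON` and the function names are illustrative assumptions, not taken from any database, and the frequency weighting mentioned above is not implemented.

```python
def is_neighbour(a, b):
    """Return True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: deleting one phoneme from the longer
    # sequence must yield the shorter one.
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


# Hypothetical broad transcriptions, for illustration only.
LEXICON = {
    "nail": ("n", "eɪ", "l"),
    "hail": ("h", "eɪ", "l"),
    "mail": ("m", "eɪ", "l"),
    "snail": ("s", "n", "eɪ", "l"),
    "name": ("n", "eɪ", "m"),
    "null": ("n", "ʌ", "l"),
    "iron": ("aɪ", "ə", "n"),
    "lion": ("l", "aɪ", "ə", "n"),
}


def neighbours(word, lexicon):
    """List the words in the lexicon that are phonological neighbours of word."""
    target = lexicon[word]
    return sorted(w for w, p in lexicon.items()
                  if w != word and is_neighbour(target, p))


print(neighbours("nail", LEXICON))  # dense neighbourhood in this toy lexicon
print(neighbours("iron", LEXICON))  # sparse neighbourhood in this toy lexicon
```

In this toy lexicon nail has five neighbours and iron has one, mirroring the dense versus sparse contrast in the example above.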

In addition to psycholinguistic factors at the wordform level, individual differences—or factors at the child-level—are also crucial in predicting children’s vocabulary knowledge. Older children have larger vocabularies than younger children (e.g. Byers-Heinlein and Werker 2009; Moore and Bosch 2009). Children also tend to know more words in their first than second or additional languages (Bialystok et al. 2010; Farnia and Geva 2011). Children’s cognitive proficiency is also important, with children with greater linguistic (Rowe et al. 2012; De Wilde et al. 2020) or general cognitive aptitude (Campbell et al. 2001; Sénéchal et al. 2008) learning words more easily than children with lower aptitude.

Wordform-level factors and child-level factors may also interact with each other to affect word learning. In particular, individuals may be affected differently by wordform-level factors in their first and additional languages. For example, children show a stronger facilitation from phonological neighbourhood density for nonword recall in their native than their second language (Messer et al. 2010). Bilingual children demonstrate a reduced mutual exclusivity bias (Davidson et al. 1997), meaning that they may be more open to words having several synonyms, and therefore could be less affected by semantic neighbourhood density. On the contrary, some word features and especially those that do not vary across languages (e.g. imageability), are likely to have a similar impact on L1 and L2 learners. Thus, there is a need to explore the possible differences in wordform-level variables affecting homonym knowledge for L1 and L2 learners. Because these factors exist at different levels of analysis (the wordform and the child), multilevel models could be valuable in examining these interactions.

PREDICTORS OF HOMONYM LEARNING

Much existing research on factors affecting word learning has overlooked the multiple meanings of wordforms and examined them instead as unitary entities. In reality, many words are homonyms, that is, they have multiple meanings (e.g. deck as in a ship’s part or a pack of cards). Investigating children’s word-learning in relation to homonyms is important for three key reasons. Firstly, many or most wordforms are thought to have multiple meanings (Rodd et al. 2002; Armstrong 2012; Brysbaert and Biemiller 2017) and more commonly used words tend to have more meanings (Zipf 1945). For example, in a set of 37,000 single words from the Wordsmyth dictionary, 54 per cent had more than one sense listed, with 10 per cent having 5 or more (Armstrong 2012). Thus, to understand children’s word-learning fully, it is critical to understand how children learn the meanings of homonyms. Secondly, some important psycholinguistic factors (i.e. imageability and part of speech) differ between word meanings (i.e. at the meaning level): for example, cycle as in to ride a bike is more imageable than cycle as in a recurring event. Such differences could explain why some meanings of known wordforms are learnt much later (Brysbaert and Biemiller 2017). Existing language norm databases for factors such as frequency (van Heuven et al. 2014) and imageability ratings (Brysbaert et al. 2014) typically include single entries for wordforms, and such averages tell us nothing about how these different meanings are learnt. Thus, it is essential for research to examine multiple meanings of wordforms and psycholinguistic factors that vary at the meaning level. Thirdly, additional psycholinguistic factors may come into play with homonyms: these include the number of senses of the word; the relative dominance of the word meanings tested; and the relatedness of the word meanings tested. Research examining these factors is discussed below.

Dominance is a critical meaning-level feature that has large effects on adults’ processing of homonyms in their first language, and limited data with children suggests a similar pattern. Dominance is the relative psychological salience of a word’s separate meanings. Whilst dominance is dynamic and can vary, both between individuals and within individuals based on experience (Rodd et al. 2016), many wordforms have a relatively stable dominant meaning. For example, for the word school, the meaning of an educational institution is likely more dominant than the meaning of a group of fish, in that it is more likely to be the first meaning that comes to mind. Dominance can be approximated by categorizing instances from corpora or word associates according to the word meaning they relate to (e.g. Parent 2012). Dominance is often dichotomized by selecting wordforms for which there is one highly dominant meaning, but in reality it exists on a spectrum. Dominant word meanings are processed more quickly by adults than subordinate meanings in lexical decision tasks (Tabossi et al. 1987; Duffy et al. 1988; Foraker and Murphy 2012) and when contexts support subordinate meanings, the dominant sense still acts as a distractor (Chen and Boland 2008) but less so vice versa. This suggests that dominant meanings are processed more easily by adults. With children, who are typically still acquiring the subordinate meanings of homonyms (Brysbaert and Biemiller 2017), there is likely to be an even more significant impact of dominance on word knowledge. One study with children found a similar pattern of results to adults: when given a sentence which disambiguated one meaning of a familiar homonym (e.g. Bill fished from the bank), children were more accurate at judging whether a subsequent image (e.g. money or a river) matched the sentence when it contained the dominant meaning of the homonym (Norbury 2005). This implies that for children, more dominant meanings are recognized in context more accurately. However, these existing studies do not show whether more dominant meanings are more likely to be known by children, that is, whether dominant and subordinate meanings have entered their receptive vocabulary yet.

Relatedness of word meanings is also an important feature of semantically ambiguous wordforms, which may influence children’s word learning. Relatedness refers to the extent to which two word meanings are conceptually similar, and runs on a continuum from completely unrelated (e.g. bank for money vs. river bank) to highly related (e.g. verb and noun forms of snow, Eddington and Tokowicz 2015). Whilst relatedness can be based on etymological histories of words, researchers are typically interested in subjective perceptions of relatedness and so this is usually measured via averaging ratings of meaning relatedness (Rodd et al. 2002). Research with adults shows that wordforms with more related meanings are processed and learnt more easily than wordforms with unrelated meanings in a variety of semantic tasks (Eddington and Tokowicz 2015; Maciejewski et al. 2020). There is also some suggestion that in an L2, adults may more readily accept related meanings of homonyms than unrelated meanings (Maby 2016). Whilst there appears to be very limited data on this topic with children, one recent study suggested that children and adults learn the meanings of novel homonyms better when their two meanings are related than if they are unrelated (Floyd and Goldberg 2020). Therefore, it seems that more highly related word meanings may be easier to learn than unrelated word meanings.

Another unique feature of homonyms that may affect their acquisition is the total number of senses that the word has. Word learning studies with adults and children suggest that it is harder to learn words with two meanings than single meanings (Doherty 2004; Degani and Tokowicz 2010) but polysemous words can have up to 35 different senses according to separate dictionary entries (Armstrong 2012). One study with L2 adult learners found that the total number of senses of polysemous words did not predict whether a word was used in productive vocabulary by these learners, whereas frequency did (Crossley and Salsbury 2010). To our knowledge, there is not yet any data with children documenting how the total number of senses a word has affects word processing or learning. After controlling for frequency (which is positively correlated with the number of senses, Zipf 1945), the number of senses of a word may have a negative impact on recognition of word meanings, because these other senses may create an indistinct concept of the word’s separate senses, and thus distract the child from selecting the correct meaning. Conversely, it could be that having more senses entails a more elaborate mental representation of a word, and thus would facilitate knowledge of a word’s senses.

As with vocabulary generally, child-level factors also impact children’s knowledge of homonyms. Specifically, older children know more homonyms than younger children (Brysbaert and Biemiller 2017; Booton et al. 2021), and children with English as a first language (EL1) know more than those with English as an additional language (EAL) (Carlo et al. 2004; Booton et al. 2021). However, the difference between children could be explained by wordform-level features of homonyms: for example, if children with EAL struggle more with low-frequency English words or less dominant meanings, then this may drive the difference between groups. The present research will examine features at the meaning, wordform, and child levels simultaneously to address such issues, as well as examining the interaction between first language and word features.

AIMS OF THE PRESENT RESEARCH

There remain a number of gaps in the literature regarding the factors affecting children’s learning of homonyms. Firstly, there are relatively few studies with children, and especially children learning English as a second language. Secondly, some important variables have been overlooked, both at the wordform-level (e.g. the total number of senses) and the meaning-level (e.g. imageability). Thirdly, there are few studies that pit multiple predictor variables against each other, including those at the meaning level, wordform level, and child level. This is important because of evident intercorrelations between psycholinguistic features (Zipf 1945; McDonough et al. 2011) and interactions between wordform and child-level factors (Klepousniotou et al. 2008; De Wilde et al. 2020).

The aim of this research is to examine which factors support recognition of homonyms for children with EAL and EL1. The primary research question was:

1. Do wordform-level (frequency, relatedness, number of senses, semantic density, and phonological density) and meaning-level (dominance, imageability, and part of speech) factors make homonyms more or less likely to be recognized by children?

A further exploratory research question was:

2. Do psycholinguistic predictors of children’s recognition of homonyms differ between EAL and non-EAL children?

To address these questions, an experiment was conducted with a multilevel design. Children’s (N = 174) receptive knowledge of two meanings of a set of 32 homonyms was measured, and the impact of meaning-level, wordform-level, and child-level factors on this knowledge was examined.

METHOD

Design

A nested multilevel design was used with word meanings nested within wordforms nested within participants. The child-level factors were language status (EAL or EL1), and non-verbal reasoning and age as continuous variables. The wordform-level factors were psycholinguistic features of the words: word frequency, relatedness of the two word-meanings, total number of senses of the wordform, semantic neighbourhood density, and phonological neighbourhood density. The meaning-level factors were separate measures for the two meanings of each wordform tested: dominance, imageability, and part of speech. The outcome variable was accuracy on a receptive vocabulary measure (0/1).
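As an illustration only, a design of this kind could be written as a logistic mixed effects model for the accuracy $y_{cwm}$ of child $c$ on meaning $m$ of wordform $w$. The random-intercept structure and the notation below are assumptions made for this sketch; this section does not state the random-effects specification that was actually fitted.

```latex
\operatorname{logit}\,\Pr(y_{cwm} = 1)
  = \beta_0
  + \boldsymbol{\beta}_C^{\top}\mathbf{x}_c      % child level: language status, age, non-verbal reasoning
  + \boldsymbol{\beta}_W^{\top}\mathbf{x}_w      % wordform level: frequency, relatedness, senses, densities
  + \boldsymbol{\beta}_M^{\top}\mathbf{x}_{wm}   % meaning level: dominance, imageability, part of speech
  + u_c + v_w,
\qquad
u_c \sim \mathcal{N}(0, \sigma_u^2), \quad
v_w \sim \mathcal{N}(0, \sigma_v^2)
```

Here $u_c$ and $v_w$ are hypothetical random intercepts for children and wordforms, which absorb the dependence between responses that share a child or a wordform.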

Participants

Participants were 174 children (84 female), recruited through 5 state-funded schools in southern England. Children were from school Years 1, 3, and 4 (aged 5 years 6 months to 9 years 8 months, M = 7.60, SD = 1.24). Details of the participants in each group are shown in Table 1.

Table 1: Details of participants in each group in the sample

| Language status | N | Female (per cent) | Age (SD) | Nonverbal reasoning (SD) |
| --- | --- | --- | --- | --- |
| EAL | 78 | 47.4 | 7.77 (1.12) | 13.62 (4.54) |
| EL1 | 96 | 49.0 | 7.47 (1.32) | 12.05 (4.12) |
| Full sample | 174 | 48.3 | 7.60 (1.24) | 12.75 (4.37) |

EAL, English as an Additional Language; EL1, English as a first language.

Children were classified as having EAL if English was not their first language. English was an additional language for 45 per cent of the sample. All other children had English as a first language and so were classified as EL1. Some of the EL1 group (29 per cent) were exposed to another language at home, but with English being their first language. Children with EAL were asked to report the language they used most commonly at home, and 87 per cent of them were able to do so. Twenty-three different languages were reported in total, the most common being Polish (n = 17), Arabic (n = 6), Tetum (n = 6), and Albanian (n = 4). According to parent or teacher reports, most children with EAL began learning English in the first year of school, at age 4 (58.3 per cent), although some began learning English before school (27.8 per cent), or after the first year of school (13.9 per cent). On average, children with EAL had an estimated 3.62 years of exposure to English (SD = 1.45, min = 0.41, max = 7.08).

Measures

A summary of all measures used in the study is shown in Table 2.

Table 2: Measures used in the study

| Type | Factor | Source | Observed range | Possible range | Measure |
| --- | --- | --- | --- | --- | --- |
| Outcome | Accuracy | RPVT | 0/1 | 0/1 | Incorrect = 0, Correct = 1 |
| Child level | L1 English | LBQ | 0/1 | 0/1 | EAL = 0, EL1 = 1 |
| | Age | LBQ | 5.52–9.66 | 5.43–9.78 | Years between date of test and date of birth |
| | Non-verbal reasoning | WISC matrix reasoning | 2–22 | 0–32 | Raw total correct |
| Wordform level | Frequency | SUBTLEX-UK | 3.65–5.81 | 1.86–7.57 | Log frequency score for CBBC programmes |
| | Relatedness | Adult ratings | 1.43–4.75 | 1.00–7.00 | Mean relatedness rating |
| | Senses | Wordsmyth dictionary | 2–28 | 2–35 | Total number of entries |
| | Semantic density | Semantic neighbourhood App | 1.68–3.12 | 1.00–3.33 (a) | Reciprocal of semantic density for 50 closest semantic neighbours |
| | Phonological density | Irvine phonotactic online dictionary | 0.76–92.38 | 0.00–100.00 (a) | Frequency-weighted score of number of phonological neighbours |
| Meaning level | Dominance | Oxford children’s corpus coding | 0–100 | 0–100 | Percentage of instances involving meaning 1 or meaning 2 |
| | Part of speech | Oxford children’s corpus coding | 0–100 | 0–100 | Percentage of instances that are nouns |
| | Imageability | Adult ratings | 2.17–6.72 | 1.00–7.00 | Mean imageability rating |

(a) Approximate possible range.

Homonym knowledge test

The Receptive Polysemy Vocabulary Test (RPVT) (Booton et al. 2021) was used to assess children’s recognition of two meanings of homonyms. Briefly, in this test children see an array of six pictures and have to select the two pictures that show two different meanings of the homonym presented (an example item is shown in Figure 1). The stimuli consisted of 32 polysemous wordforms contained within the RPVT (Booton et al. 2021), including the test items and 2 practice questions. The words are listed with their two tested meanings and psycholinguistic measures in Supplementary materials (Table S1), along with more details about the stimuli. The test was conducted on a tablet and was self-paced. Items were scored for accuracy (correct or incorrect) separately for the two meanings and the test showed good reliability in this sample (Cronbach’s α = 0.875).
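For readers unfamiliar with the reliability statistic reported above, Cronbach’s α can be computed from a children-by-items matrix of 0/1 scores as sketched below. The function name and the toy data are illustrative assumptions and do not reproduce the study’s data or analysis software.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one per child),
    each row holding that child's item scores (here 0 or 1)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose to per-item columns
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: four children, three items (illustrative only).
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))
```

Sample variances are used for both the items and the total scores; using population variances throughout would give the same α because the scaling factor cancels.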

Figure 1: Example item from the RPVT for pupil.

Child language background

A short survey was conducted to ascertain the language background of the participants. The survey was completed by either the child’s parent (n = 112) or teacher (n = 62). In the parent survey, parents were asked to indicate children’s first and additional languages. If English was not a first language, the child was categorized as having EAL. In the teacher survey, teachers were asked three questions: whether the child had EAL according to school records; whether they had EAL according to the definition of the child ‘not having English as a native (first) language’; and whether they did not have a parent at home who was a native English speaker. Children were classified as having EAL if at least two out of the three responses were yes. In both surveys, adults were asked to indicate at what age children began learning English (teachers sometimes consulted with colleagues working in reception year to do so). These were classified into three groups: before school, at the start of school in reception year (age 4), or after reception year.
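The two classification rules described above can be expressed as a short sketch. The function and argument names are hypothetical and are introduced purely for illustration.

```python
def classify_from_parent_survey(first_languages):
    """Parent survey rule: EAL if English is not among the child's first languages."""
    return "EL1" if "English" in first_languages else "EAL"


def classify_from_teacher_survey(eal_in_school_records,
                                 english_not_first_language,
                                 no_native_english_parent_at_home):
    """Teacher survey rule: EAL if at least two of the three answers are yes."""
    yes_count = sum([eal_in_school_records,
                     english_not_first_language,
                     no_native_english_parent_at_home])
    return "EAL" if yes_count >= 2 else "EL1"


print(classify_from_parent_survey(["Polish"]))            # EAL
print(classify_from_teacher_survey(True, False, True))    # EAL (two yes answers)
print(classify_from_teacher_survey(False, False, True))   # EL1 (one yes answer)
```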

Non-verbal reasoning

Children’s non-verbal reasoning was assessed using the Matrix Reasoning subtest of the Wechsler Intelligence Scale for Children fifth edition (WISC-MR; Wechsler, 2016). Raw scores (the number of items answered correctly) were taken as the outcome measure with a possible range from 0 to 32 (actual range: 2–22, M = 12.76, SD = 4.37).

Psycholinguistic measures

Meaning-level factors

The following measures varied at the level of the word meaning, that is, the values of dominance, part of speech, and imageability were computed separately for each word meaning (e.g. duck the bird and duck the action have separate values for each of these factors).

Dominance

To assess the dominance of the two meanings of the homonyms tested separately, concordance lines from the Oxford Children’s Corpus (OCC) 2017 reading corpus were coded (similar to Parent 2012). The OCC 2017 reading corpus consists of 35 million words and 12,000 documents, including fiction, non-fiction, educational materials, and websites, written for children aged 5–16 years old. Concordance lines for 400 instances of each word were coded based on definitions from the Wordsmyth dictionary (Parks et al. 2020), into the primary meaning, secondary meaning, or another meaning. Further details about the coding process can be found in Supplementary materials. One trained rater coded all concordance lines. An additional independent rater coded 10 per cent of the included concordance lines (160 lines across 8 words) and inter-rater agreement was excellent (κ = 0.954). Scores consisted of the percentage of instances for meaning 1 and meaning 2 (which added to 100 per cent). Dominance ranged from 0 per cent to 100 per cent (M = 49.96, SD = 34.22).
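The scoring and agreement steps described above can be sketched as follows. The sketch assumes that instances coded as another meaning are excluded from the denominator, which is implied by the two scores adding to 100 per cent; the code labels, function names, and toy data are hypothetical.

```python
from collections import Counter


def dominance_scores(codes):
    """Percentage of coded instances for meaning 1 and meaning 2.
    Instances coded as any other meaning are excluded (assumption)."""
    counts = Counter(codes)
    m1, m2 = counts["meaning_1"], counts["meaning_2"]
    total = m1 + m2
    if total == 0:
        raise ValueError("no instances of either tested meaning")
    return 100 * m1 / total, 100 * m2 / total


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same concordance lines."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Toy coding of ten concordance lines for one wordform (illustrative only).
toy_codes = ["meaning_1"] * 6 + ["meaning_2"] * 2 + ["other"] * 2
print(dominance_scores(toy_codes))

# Toy double-coded subset (illustrative only).
rater_a = ["meaning_1", "meaning_1", "meaning_2", "meaning_2"]
rater_b = ["meaning_1", "meaning_1", "meaning_2", "meaning_1"]
print(cohen_kappa(rater_a, rater_b))
```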

Part of speech

To assess the part of speech of the two meanings of the polysemous words, uses from the Oxford Children’s Corpus were coded. Whilst coding concordance lines for dominance (see above), the primary rater also assessed the part of speech of the use as either a noun, verb, or modifier (adjective or adverb). Because few word meanings (9 per cent) were consistently used as verbs or adjectives, and nouns tend to be the easiest wordforms to learn and process (Piccin and Waxman 2007; Kauschke and Stenneken 2008), a continuous measure of the percentage of instances used as nouns was taken. This ranged from 0 to 100 per cent (M = 75.59, SD = 36.51).

Imageability

To assess the imageability of the two meanings of the polysemous words tested separately, adults’ ratings were obtained following methods from previous research (Stadthagen-Gonzalez and Davis 2006). Ratings were collected via a survey in Qualtrics (Qualtrics Labs Inc., Provo, UT). Adults with EL1 (N = 40) aged 19–63 years (M = 28.48; SD = 13.85) completed imageability ratings for each of the 64 word-meanings on a scale from 1 (low imageability) to 7 (high imageability). Imageability was defined as how easily a word ‘provokes a mental image (i.e. a visual mental picture)’. Example items were completed which varied in imageability (car, writer, vapour, and democracy) prior to rating the RPVT items to anchor participants in using the scale. The average imageability ratings were towards the higher end (M = 5.18, SD = 1.10, min = 2.18, max = 6.73).

Wordform-level factors

The following measures varied at the level of the wordform; that is, a single value per wordform was computed for each of frequency, number of word senses, relatedness of word senses, semantic neighbourhood density, and phonological neighbourhood density (e.g. there is a single measure of each of these factors for duck, regardless of whether it refers to the bird, the action, or another meaning).

Frequency

The frequency of each wordform in children’s TV programmes was extracted from the SUBTLEX-UK database (van Heuven et al. 2014). The log frequency (Zipf) score for CBBC programmes (TV shows aimed at 6- to 12-year-olds) was used, and frequencies for the wordforms ranged from 3.65 to 5.81 (M = 4.63, SD = 0.52).

Relatedness

To assess the relatedness of the two meanings of the homonyms, adult ratings were obtained following methods from previous research (Rodd et al. 2002). The same adults (N = 40) who completed imageability ratings subsequently rated each of the 32 words from the RPVT on the similarity of its two meanings, on a scale from 1 (not at all related) to 7 (highly related). Example items which varied in relatedness (brush: noun and verb forms; mouse: animal and computer; wood: forest and material; lie: recline and fib) were presented prior to rating the RPVT items, to anchor participants in using the scale. The averaged relatedness ratings were towards the lower end (M = 2.93, SD = 1.15, min = 1.43, max = 4.75).

Number of word senses

The number of word senses was obtained from a database using the Wordsmyth dictionary (Armstrong 2012): the total number of dictionary entries for the wordform was used. The number of senses ranged from 2 to 28 (M = 10.31, SD = 5.35).

Semantic neighbourhood density

Whilst one would ideally estimate semantic neighbourhood density at the meaning level, no existing sources could be found that indexed semantic density separately by word meanings. Thus, it was estimated at the word level only. The semantic neighbourhood density was derived from the semantic distance estimates in the Semantic Neighbourhood App (Lutfallah et al. 2018). These distance estimates are based on first calculating the frequency of co-occurrence of all wordforms in the Wikipedia Corpus (Shaoul and Westbury 2010), and then estimating the size of the difference between wordforms in patterns of co-occurrence. The average distance was calculated for each wordform for its 50 closest semantic neighbours. To transform distance into density, the reciprocal was taken (1 divided by the distance). Thus, higher scores indicate higher density. The semantic neighbourhood density ranged from 1.68 to 3.12 (M = 2.39, SD = 0.38).
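The distance-to-density transformation can be sketched as follows. This is only an illustration of the arithmetic described above (mean distance to the 50 closest neighbours, then the reciprocal); the function name and the toy distances are hypothetical, and the sketch does not reproduce how the Semantic Neighbourhood App computes its distance estimates.

```python
def semantic_density(distances, k=50):
    """Reciprocal of the mean distance to the k closest semantic neighbours.

    `distances` holds the distance from the target wordform to each
    candidate neighbour; smaller values mean closer neighbours.
    """
    if not distances:
        raise ValueError("at least one neighbour distance is required")
    nearest = sorted(distances)[:k]          # the k closest neighbours
    mean_distance = sum(nearest) / len(nearest)
    return 1.0 / mean_distance               # higher score = denser neighbourhood


# Toy data: 50 close neighbours at distance 0.5 and 100 more distant ones.
toy_distances = [0.5] * 50 + [0.9] * 100
density = semantic_density(toy_distances)
print(density)
```

Because the reciprocal is taken, a wordform whose nearest neighbours are closer on average receives a higher density score.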

Phonological neighbourhood density

The phonological neighbourhood density was obtained from the Irvine Phonotactic Online Dictionary, version 2.0 (Vaden et al. 2009). The phonological neighbourhood density weighted by frequency (from SUBTLEXus) was used (column labelled stressed log 10 SF). This indicates the number of words that differ from the target word by one phoneme, controlling for the frequency of the phonological neighbour words. This weighted score ranged from 0.76 to 92.38 (M = 25.50, SD = 19.65).

Procedure

The child homonym knowledge data were collected as part of two other research projects, one published (Booton et al. 2021) and the other unpublished. The data from Booton et al. (2021) were from the first session of the RPVT with the full sample (n = 112). The unpublished data were from the RPVT in a pre-test session of an intervention study (n = 63) which had to be discontinued due to the Covid-19 pandemic.

RESULTS

Relationships between psycholinguistic factors

Bivariate correlations between wordform and meaning-level factors are shown in Table 3. There is a positive association between part of speech and imageability, indicating that meanings that are more commonly used as nouns are more imageable. There are also positive associations between frequency and senses, indicating that more frequent wordforms tended to have more senses, and between frequency and phonological density, indicating that more frequent wordforms tended to have denser phonological neighbourhoods.

Table 3:

Correlations between wordform and meaning-level psycholinguistic variables for words from the homonym knowledge test

Variable             1         2         3         4         5         6         7
1. Dominance
2. Part of speech    0.069
3. Imageability      −0.086    0.272*
4. Frequency         n/a       −0.176    −0.001
5. Relatedness       n/a       −0.132    −0.118    0.214
6. Senses            n/a       −0.149    0.019     0.345**   0.088
7. Sem. density      n/a       −0.137    0.041     0.128     −0.047    0.095
8. Phon. density     n/a       −0.215    0.169     0.314*    −0.175    0.113     0.180

* p < 0.05; ** p < 0.01; n/a = not computed.

Note that correlations between the meaning-level variable of dominance and wordform-level variables are missing because by definition there can be no correlation, as there is no wordform-level variance (dominance adds to 100 per cent across the two meanings for each word).

Logistic mixed effects models

Due to the multivariate nature of the data, and the research questions focusing on factors at the level of participants and wordforms/meanings, the data were analysed using logistic mixed effects models using the lme4 package (Bates et al. 2015) in R version 4.0.2 (R Development Core Team 2017). The dataset and analysis code are available at https://osf.io/2ax7q/. The BOBYQA algorithm (Powell 2009) was used for optimization and sjPlot (Lüdecke 2020) to generate tables and plots as well as R² estimates (Nakagawa et al. 2017). Prior to the analysis, all continuous fixed factors were transformed to reduce skew, and then centred and scaled to create z-scores, and all dichotomous fixed factors were centred so that main effects were estimated as average effects over all levels of the other variables (rather than at a specific reference level for each factor).
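The predictor preparation described above (z-scoring continuous predictors and centring dichotomous ones) can be sketched in a few lines. This is an illustration only: the analysis itself was run in R, the skew-reducing transformation is not specified here and is therefore omitted, and the use of the sample standard deviation is an assumption. The function names and toy values are hypothetical.

```python
from statistics import mean, stdev


def z_score(values):
    """Centre and scale a continuous predictor (sample SD, an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def centre(values):
    """Centre a 0/1 predictor so its mean is zero.

    With centred dichotomous predictors, main effects of the other
    predictors are estimated as averages over both groups rather than
    at a reference level.
    """
    m = mean(values)
    return [v - m for v in values]


# Toy data: Zipf frequencies for five wordforms and a 0/1 language-status code.
zipf = [3.65, 4.10, 4.63, 5.20, 5.81]
l1_english = [0, 0, 1, 1]

zipf_z = z_score(zipf)
l1_centred = centre(l1_english)
print(zipf_z)
print(l1_centred)
```

After z-scoring, each odds ratio for a continuous predictor refers to a change of 1 SD in that predictor, which is how the effects are reported in the results.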

To construct the maximal base model (i.e. the most complex model, which converged with a non-singular fit (Barr et al. 2013)) we began with a model with random intercepts for participant (with 174 levels) and wordform (with 32 levels); fixed effects for eleven variables (see Table 2), comprising all predictors at the level of the participant, the wordform, and the meaning; random by-participant slopes for all eight wordform-level and meaning-level fixed effects; random by-wordform slopes for all three child-level fixed effects and all three meaning-level fixed effects; and all correlations between slopes. This initial model had a singular fit, so the model was simplified as described in Supplementary materials. This process led to a base model (Model 1) which contained random intercepts for participants and wordforms; random slopes by-wordform for age, non-verbal reasoning, dominance, imageability, and part of speech, with no correlations between the random intercept for wordforms and random slopes by-wordforms; and all eleven fixed effects, comprising those at the level of the participant, the wordform, and the meaning. Collinearity was checked for all variables in all models and was low (VIF ≤ 1.36).

Question 1: Which wordform-level and meaning-level factors make homonyms more or less likely to be recognized by children?

To answer this question, we used our base logistic mixed effects model. The model is shown in the left columns of Table 4 (for tables of random effects for all models, see Supplementary Table S1).

Table 4:

Results of logistic mixed effects models predicting accuracy for Questions 1 and 2

                              Model 1                      Model 2
Predictors                    OR     CI          p         OR     CI          p
(Intercept)                   3.61   2.59–5.05   <0.001    3.64   2.60–5.09   <0.001
L1 English                    1.93   1.59–2.35   <0.001    2.00   1.63–2.44   <0.001
Age                           1.59   1.39–1.81   <0.001    1.59   1.40–1.82   <0.001
WISC                          1.46   1.29–1.65   <0.001    1.46   1.29–1.65   <0.001
Frequency                     1.72   1.19–2.50   0.004     1.73   1.19–2.51   0.004
Relatedness                   1.36   0.97–1.91   0.074     1.36   0.97–1.91   0.076
Senses                        1.26   0.89–1.78   0.194     1.27   0.89–1.80   0.183
Semantic density              0.91   0.66–1.26   0.586     0.91   0.66–1.26   0.572
Phonological density          1.00   0.71–1.42   0.984     1.00   0.71–1.42   0.986
Dominance                     2.17   1.47–3.20   <0.001    2.17   1.47–3.20   <0.001
Part of speech                0.72   0.46–1.12   0.147     0.72   0.46–1.12   0.145
Imageability                  2.11   1.37–3.25   0.001     2.13   1.39–3.29   0.001
L1 English × Frequency                                     0.99   0.87–1.14   0.914
L1 English × Relatedness                                   0.97   0.87–1.08   0.624
L1 English × Senses                                        1.08   0.96–1.21   0.189
L1 English × Sem. density                                  0.94   0.85–1.04   0.258
L1 English × Phon. density                                 0.99   0.89–1.11   0.917
L1 English × Dominance                                     1.01   0.90–1.13   0.891
L1 English × Part of speech                                0.95   0.84–1.06   0.346
L1 English × Imageability                                  1.16   1.03–1.31   0.016
Marginal R²                   0.352                        0.356
Conditional R²                0.481                        0.485

The marginal R² of Model 1 suggests that the fixed effects included account for 35.2 per cent of the variance in accuracy. This model demonstrates significant unique effects of dominance, imageability, and frequency. The odds ratio of dominance indicates a medium effect size: increasing dominance by 1 SD multiplies the odds of knowing that word sense by 2.17. The odds ratio of imageability indicates that increasing imageability by 1 SD multiplies the odds of knowing that word sense by 2.11. For frequency, the odds ratio shows that increasing the total frequency of the wordform in children’s TV programmes by 1 SD increases the odds of knowing the meanings of that word by 72 per cent. There was also a trend for a positive effect of relatedness, which hinted that increasing the relatedness of the two word meanings by 1 SD might increase the odds of knowing the meanings of that word by around 36 per cent, although because the confidence interval contains 1 this should be interpreted with caution. There was no additional effect of part of speech, senses, semantic neighbourhood density, or phonological neighbourhood density. Thus, both wordform and meaning-level factors contribute uniquely to children’s knowledge of homonyms after controlling for child-level factors.
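Because odds ratios act on odds rather than on probabilities, a short worked example may help in reading the effect sizes in Table 4. The sketch below takes the Model 1 odds ratios as given and, as a simplifying assumption, treats the intercept (3.61) as the odds of a correct response for an average child and item with all random effects at zero. The helper names are illustrative.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def odds_to_probability(odds):
    """Convert odds of a correct response to a probability."""
    return odds / (1.0 + odds)


# Values from Model 1 in Table 4.
baseline_odds = 3.61   # intercept, interpreted as baseline odds (assumption)
or_dominance = 2.17
or_frequency = 1.72

p_baseline = odds_to_probability(baseline_odds)
p_dominance_plus_1sd = odds_to_probability(baseline_odds * or_dominance)

print(round(percent_change_in_odds(or_frequency)))   # change in odds for +1 SD frequency
print(round(p_baseline, 3), round(p_dominance_plus_1sd, 3))
```

On these assumptions, a 1 SD increase in dominance moves the predicted probability of a correct response from roughly 0.78 to roughly 0.89: the odds more than double, but the probability does not, because the baseline is already high.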

Question 2: Do psycholinguistic predictors of children’s recognition of homonyms differ between EAL and Non-EAL children?

To assess whether wordform-level and meaning-level factors impact recognition of homonyms differently for EAL compared to EL1 children, interactions were added between language status and all 8 wordform-level and meaning-level factors. The model would not converge with random by-participant slopes for the interactions, so these were not included. For this more exploratory question, correction for multiple comparisons was conducted with an adjusted threshold for significance of p = 0.05/8 = 0.00625. As shown in Table 4, the fixed effects accounted for 35.6 per cent of the variance in accuracy, adding a negligible 0.4 per cent to Model 1, and indeed the chi-squared test comparing the models did not find a significant improvement between Model 1 and Model 2 (χ²(8) = 9.00, p = 0.324). At the adjusted significance level, none of the interactions between language status and any of the wordform-level and meaning-level factors were significant. Graphs of the predicted values from the model for these interactions are shown in Supplementary materials (Figure S2). Thus, no evidence was found that wordform-level and meaning-level factors impact recognition of homonyms differently for EAL compared to EL1 children.
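The adjusted threshold can be checked against the interaction p-values reported for Model 2 in Table 4. The following minimal sketch simply applies the 0.05/8 cut-off to those values; the dictionary and variable names are illustrative.

```python
# p-values for the eight language-status interactions (Model 2, Table 4).
interaction_p = {
    "Frequency": 0.914,
    "Relatedness": 0.624,
    "Senses": 0.189,
    "Sem. density": 0.258,
    "Phon. density": 0.917,
    "Dominance": 0.891,
    "Part of speech": 0.346,
    "Imageability": 0.016,
}

# Bonferroni-style adjustment: divide alpha by the number of tests.
alpha_adjusted = 0.05 / len(interaction_p)

significant = [name for name, p in interaction_p.items() if p < alpha_adjusted]
smallest = min(interaction_p, key=interaction_p.get)

print(alpha_adjusted)
print(significant)
print(smallest, interaction_p[smallest])
```

The smallest p-value belongs to the imageability interaction, which falls below the conventional 0.05 level but not below the adjusted threshold.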

DISCUSSION

This study was the first to investigate the effect of meaning-, wordform-, and child-level factors on children’s knowledge of the multiple meanings of words. It found that all three levels (meaning level, wordform level, and child level) were independently important. The dominance of word meanings (i.e. the percentage of instances of the wordform in children’s texts that had this meaning) positively predicted knowledge. This suggests that the rate at which children encounter specific meanings of words in texts is important for their word learning. This is consistent with data showing that adults and children recognize dominant word meanings faster and more accurately than subdominant meanings (Norbury 2005; Foraker and Murphy 2012). Our data extend this by showing that continuous variation in dominance predicts understanding of word meanings. This implies that children develop a better understanding of the multiple meanings of words when they encounter these meanings in texts.

Another meaning-level factor that was important was the imageability of word meanings. More imageable word meanings were better known. This mirrors findings from learning experiments with adults and children that imageability ratings for words overall support learning (e.g. McFalls et al. 1996; Palmer et al. 2013). The present data suggest that imageability ratings for specific word meanings of homonyms also predict children’s word knowledge. There remained an effect of imageability after controlling for frequency (and dominance) in this study, unlike one previous study (Elleman et al. 2017), suggesting that imageability and frequency can contribute independently to knowledge of the multiple meanings of words. Imageability may have particular importance for the test used in this study, because more imageable words should be easier to capture and recognize from pictorial representations.

With respect to wordform-level factors, we found a unique positive effect of word frequency on homonym knowledge. This replicates the well-established finding that the frequency with which a word appears in text contributes to adults’ and children’s knowledge of homonyms (Graves et al. 1980; Elleman et al. 2017) and suggests that even after accounting for other factors (including the dominance of each word meaning, as measured by their relative frequency), the total frequency of the wordform remains important.

No evidence was found for a significant impact of part of speech, meaning relatedness, total number of senses, semantic density, or phonological density in this study. Perhaps most surprising is the lack of effect of part of speech, given that nouns tend to be learnt earlier than verbs (Bloom et al. 1993; Piccin and Waxman 2007; Goodman et al. 2008). However, previous studies have suggested that the noun advantage is due to higher imageability (Piccin and Waxman 2007), which was controlled in this study. Thus, perhaps after controlling for imageability, alongside other factors, there is no noun advantage. For the total number of senses, there is no existing data with which to compare. It could be that this variable is genuinely irrelevant, or that the effect was too small to detect in this study. There was no significant effect of relatedness of word meanings on homonym knowledge, although the effect trended in a positive direction. Some other studies have suggested that homonyms with related meanings are processed more quickly in semantic tasks (Eddington and Tokowicz 2015) and also learnt more easily by children and adults (Floyd and Goldberg 2020; Maciejewski et al. 2020). The lack of significance may be partly because the range of relatedness values was somewhat restricted, with more homonyms than polysemes in this study, but more data would be needed to verify this.

No effect was found for semantic density either. Previous studies have found a positive impact of semantic density on word processing (Buchanan et al. 2001; Danguecan and Buchanan 2016), but a negative impact of semantic density, in terms of synonyms known by children, on word learning (Nicoladis and Laurent 2020). However, there is no existing data on this topic relating to homonyms: the data here find no evidence for a consistent effect of semantic density at the wordform-level on homonym knowledge. However, because the index of semantic density only exists at the wordform-level, and does not weight for frequency, it does not provide the best proxy of whether children are likely to already know a word with a similar meaning. Thus, it remains unclear whether semantic density at the meaning-level would make a difference. There was also no significant effect of phonological density in the present study. This could be because, after controlling for other factors, phonological density has limited impact on children’s homonym knowledge. Phonological density has previously been linked to both learning benefits and disadvantages depending on how knowledge is assessed (Garlock et al. 2001; Storkel 2009; Hoover et al. 2010) so it could also be that these effects cancelled each other out.

The findings also suggest that the difference in homonym knowledge between EAL and EL1 children previously found (Booton et al. 2021) is not explained by differential sensitivity to wordform or meaning-level factors. Firstly, there remained a significant effect of language status on accuracy after controlling for a range of key psycholinguistic features. Secondly, no evidence was found for an interaction between language status and word or sense-level variables; in other words, there was no strong evidence here that first and second language learners are differentially sensitive to features of wordforms and word meanings when learning homonyms. It is possible that interactions do exist with relatively small effect sizes, which could not be identified in this study: indeed, the interaction with imageability approached significance, although this was difficult to interpret, and so more data are needed to examine possible subtle differences between groups. The present data at least imply that it is relatively unlikely that differential impacts of features of words (e.g. frequency) are the only factor driving the difference in homonym knowledge between EAL and EL1 children.

The findings presented here provide suggestions for vocabulary learning and teaching. They suggest that exposing children more frequently to the different meanings of homonyms, as well as to the wordform in general, could improve their understanding of those meanings. The results also highlight that less imageable concepts are harder for children to understand, and therefore that children may need support in visualizing word meanings to help them to learn and remember them. The data affirm that EAL students tend to know fewer meanings of homonyms than EL1 students, but that this is unlikely to be due to differences in sensitivity to features of words (such as EAL students having difficulty with more infrequent meanings). This has two important implications for supporting EAL students with a few years of language exposure: firstly, that EAL students may need particular support with these kinds of words, but secondly that the form of this support may not need to be dissimilar to what EL1 students benefit from (e.g. exposure to word meanings in text).

Despite many strengths, the present research has some limitations. This study was only correlational by design, so we cannot infer causal relationships between the variables examined and children’s word knowledge. Only 32 words were sampled and 2 meanings of each of these words, and these may not be representative of all semantically ambiguous words and their meanings. Word meanings were selected to be distinct to avoid confusion, and so have relatively low relatedness, and were also tested using a visual receptive measure, so are necessarily more imageable than average. While this reduces measurement error, it also makes the words less representative of all vocabulary that children need to learn. Future research could use a broader sample of words with written or auditory response options to verify the impact of psycholinguistic features on word knowledge. One strength of the design which could also be considered a limitation is the inclusion of many psycholinguistic factors in the models: this meant that more complex versions of the model (i.e. with full random slopes) could not converge. Using a larger sample of words in future work could overcome this.

Future research should further explore the factors affecting children’s homonym learning. As well as replicating the present findings with a broader sample of polysemous words, further studies could include additional factors of interest (e.g. age of acquisition, familiarity of word meanings). To explore causal relationships, intervention studies which test approaches to developing children’s knowledge of multiple word meanings (e.g. Zipke et al. 2009; Zipke 2011) would be informative, especially ones which take into account the factors that impact meaning knowledge identified in this study.

In conclusion, this research demonstrates how a range of features of words contribute to children’s understanding of homonyms, and that meaning-level features (especially dominance and imageability) can partially explain why some meanings of words are known better than others. It suggests that the difficulty children with EAL have with homonyms is not solely due to the psycholinguistic properties of these words, and that the features that make word meanings harder to learn are broadly similar for children with EAL and EL1.

The authors would like to thank the children, teachers, and parents who helped with this research. We would also like to thank Nilanjana Banerji for her help with the Oxford Children’s Corpus, and Lisa Henderson for her thoughtful questions at a conference presentation which in part motivated this work.

FUNDING

This work was supported by Ferrero International, grant number R52124/CN001.

SUPPLEMENTARY DATA

Supplementary material is available at Applied Linguistics online.

REFERENCES

Armstrong
B. C.
 
2012
.
Wordsmyth database,’ retrieved 24 August 2020, from http://blairarmstrong.net/tools/index.html#Wordsmyth.

Barr
D. J.
,
Levy
R.
,
Scheepers
C.
,
Tily
H. J.
.
2013
. ‘
Random effects structure for confirmatory hypothesis testing: Keep it maximal
,’
Journal of Memory and Language
 
68
:
255
78
.

Bates
D.
,
Mächler
M.
,
Bolker
B. M.
,
Walker
S. C.
.
2015
. ‘
Fitting linear mixed-effects models using lme4
,’
Journal of Statistical Software
 
67
:
1
48
.

Bialystok
E.
,
Luk
G.
,
Peets
K. F.
,
Yang
S.
.
2010
. ‘
Receptive vocabulary differences in monolingual and bilingual children
,’
Bilingualism (Cambridge, England)
 
13
:
525
31
.

Bleses
D.
,
Makransky
G.
,
Dale
P. S.
,
Højen
A.
,
Ari
B. A.
.
2016
. ‘
Early productive vocabulary predicts academic achievement 10 years later
,’
Applied Psycholinguistics
 
37
/
6
:
1461
76
.

Bloom
L.
,
Tinker
E.
,
Margulis
C.
.
1993
. ‘
The words children learn: Evidence against a noun bias in early vocabularies
,’
Cognitive Development
 
8
/
4
:
431
50
.

Booton
S. A.
,
Hodgkiss
A.
,
Mathers
S.
,
Murphy
V. A.
.
2021
. ‘
Measuring knowledge of multiple word meanings in children with English as a first and an additional language and the relationship to reading comprehension
,’
Journal of Child Language
. Advanced online publication.

Brysbaert
M.
,
Biemiller
A.
.
2017
. ‘
Test-based age-of-acquisition norms for 44 thousand English word meanings
,’
Behavior Research Methods
 
49
:
1520
3
.

Brysbaert
M.
,
Warriner
A. B.
,
Kuperman
V.
.
2014
. ‘
Concreteness ratings for 40 thousand generally known English word lemmas
,’
Behavior Research Methods
 
46
:
904
11
.

Buchanan
L.
,
Westbury
C.
,
Burgess
C.
.
2001
. ‘
Characterizing semantic space: Neighborhood effects in word recognition
,’
Psychonomic Bulletin and Review
 
8
/
3
:
531
44
.

Byers-Heinlein
K.
,
Werker
J. F.
.
2009
. ‘
Monolingual, bilingual, trilingual: Infants’ language experience influences the development of a word-learning heuristic
,’
Developmental Science
 
12
:
815
23
.

Campbell
J. M.
,
Bell
S. K.
,
Keith
L. K.
.
2001
. ‘
Concurrent validity of the Peabody Picture Vocabulary Test-Third Edition as an intelligence and achievement screener for low SES African American children
,’
Assessment
 
8
:
85
94
.

Carlo
M. S.
,
August
D.
,
Mclaughlin
B.
,
Snow
C. E.
,
Dressler
C.
,
Lippman
D. N.
,
Lively
T. J.
,
and White
C. E.
 
2004
. ‘
Closing the gap: Addressing the vocabulary needs of English-language learners in bilingual and mainstream classrooms
,’
Reading Research Quarterly
 
39
/
2
:
188
215
.

Chen
L.
,
Boland
J. E.
.
2008
. ‘
Dominance and context effects on activation of alternative homophone meanings
,’
Memory & Cognition
 
36
:
1306
23
.

Crossley
S.
,
Salsbury
T.
.
2010
. ‘
Using lexical indices to predict produced and not produced words in second language learners
,’
The Mental Lexicon
 
5
/
1
:
115
47
.

Danguecan
A. N.
,
Buchanan
L.
.
2016
. ‘
Semantic neighborhood effects for abstract versus concrete words
,’
Frontiers in Psychology
 
7
:
1
15
.

Davidson
D.
,
Jergovic
D.
,
Imami
Z.
,
Theodos
V.
.
1997
. ‘
Monolingual and bilingual childrens use of the mutual exclusivity constraint
,’
Journal of Child Language
 
24
:
3
24
.

De Wilde
V.
,
Brysbaert
M.
,
Eyckmans
J.
.
2020
. ‘
Learning english through out-of-school exposure: How do word-related variables and proficiency influence receptive vocabulary learning?
,’
Language Learning
 
70
:
349
81
.

Degani
T.
,
Tokowicz
N.
.
2010
. ‘
Ambiguous words are harder to learn
,’
Bilingualism
 
13
/
3
:
299
314
.

Doherty
M. J.
 
2004
. ‘
Childrens difficulty in learning homonyms
,’
Journal of Child Language
 
31
:
203
14
.

Duffy
S. A.
,
Morris
R. K.
,
Rayner
K.
.
1988
. ‘
Lexical ambiguity and fixation time in reading
,’
Journal of Memory and Language
 
27
/
4
:
429
46
.

Durda
K.
,
Buchanan
L.
.
2008
. ‘
WINDSORS: Windsor improved norms of distance and similarity of representations of semantics
,’
Behavior Research Methods
 
40
:
705
12
.

Eddington
C. M.
,
Tokowicz
N.
.
2015
. ‘
How meaning similarity influences ambiguous word processing: The current state of the literature
,’
Psychonomic Bulletin & Review
 
22
:
13
37
.

Elgort
I.
,
Warren
P.
.
2014
. ‘
L2 vocabulary learning from reading: Explicit and tacit lexical knowledge and the role of learner and item variables
,’
Language Learning
 
64
/
2
:
365
414
.

Elleman
A. M.
,
Steacy
L. M.
,
Olinghouse
N. G.
,
Compton
D. L.
.
2017
. ‘
Examining child and word characteristics in vocabulary learning of struggling readers
,’
Scientific Studies of Reading
 
21
:
133
45
.

Ellis
N. C.
,
Beaton
A.
.
1993
. ‘
Psycholinguistic determinants of foreign language vocabulary learning
,’
Language Learning
 
43
/
4
:
559
617
.

Farnia
F.
,
Geva
E.
.
2011
. ‘
Cognitive correlates of vocabulary growth in English language learners
,’
Applied Psycholinguistics
 
32
/
4
:
711
38
.

Floyd
S
,
A. E.
 
Goldberg
.
2021
. ‘
Children make use of relationships across meanings in word learning
,’
Journal of Experimental Psychology. Learning, Memory, and Cognition
 
47/
1:
29
44
. 32105145

Foraker
S.
,
Murphy
G. L.
.
2012
. ‘
Polysemy in sentence comprehension: Effects of meaning dominance
,’
Journal of Memory and Language
 
67
:
407
25
.

Garlock
V. M.
,
Walley
A. C.
,
Metsala
J. L.
.
2001
. ‘
Age-of-acquisition, word frequency, and neighborhood density effects on spoken word recognition by children and adults
,’
Journal of Memory and Language
 
45
/
3
:
468
92
.

Goodman
J. C.
,
Dale
P. S.
,
Li
P.
.
2008
. ‘
Does frequency count? Parental input and the acquisition of vocabulary
,’
Journal of Child Language
 
35
:
515
31
.

Graves
M. F.
,
Boettcher
J. A.
,
Peacock
J. L.
,
Ryder
R. J.
.
1980
. ‘
Word frequency as a predictor of students reading vocabularies
,’
Journal of Literacy Research
 
12
/
2
:
117
27
.

Hoffman
P.
,
Woollams
A. M.
.
2015
. ‘
Opposing effects of semantic diversity in lexical and semantic relatedness decisions
,’
Journal of Experimental Psychology: Human Perception and Performance
 
41
:
385
402
.

Hoover
J. R.
,
Storkel
H. L.
,
Hogan
T. P.
.
2010
. ‘
A cross-sectional comparison of the effects of phonotactic probability and neighborhood density on word learning by preschool children
,’
Journal of Memory and Language
 
63
:
100
16
.

James
E.
,
Gaskell
M. G.
,
Pearce
R.
,
Korell
C.
,
Dean
C.
,
Henderson
L. M.
.
2021
. ‘The role of prior lexical knowledge in children’s and adults’ word learning from stories’, PsyArXiv doi:10.31234/osf.io/vm5ad.

Kauschke, C., Stenneken, P. 2008. ‘Differences in noun and verb processing in lexical decision cannot be attributed to word form and morphological complexity alone,’ Journal of Psycholinguistic Research 37: 443–52.

Klepousniotou, E., Titone, D., Romero, C. 2008. ‘Making sense of word senses: The comprehension of polysemy depends on sense overlap,’ Journal of Experimental Psychology: Learning, Memory, and Cognition 34: 1534–43.

Lüdecke, D. 2020. ‘sjPlot: Data Visualization for Statistics in Social Science’. R package version 2.8.7, https://CRAN.R-project.org/package=sjPlot.

Lutfallah, S., Fast, C., Rangan, C., Buchanan, L. 2018. ‘Semantic neighbourhoods: There’s an app for that,’ Mental Lexicon 13/3: 388–93.

Maby, M. 2016. An Investigation of L2 English Learners’ Knowledge of Polysemous Word Senses. Ph.D. thesis, Cardiff University.

Maciejewski, G., Rodd, J. M., Mon-Williams, M., Klepousniotou, E. 2020. ‘The cost of learning new meanings for familiar words,’ Language, Cognition and Neuroscience 35: 188–210.

Masrai, A., Milton, J. 2018. ‘Measuring the contribution of academic and general vocabulary knowledge to learners’ academic achievement,’ Journal of English for Academic Purposes 31: 44–57.

McDonough, C., Song, L., Hirsh-Pasek, K., Golinkoff, R. M., Lannon, R. 2011. ‘An image is worth a thousand words: Why nouns tend to dominate verbs in early word learning,’ Developmental Science 14/2: 181–9.

McFalls, E. L., Schwanenflugel, P. J., Stahl, S. A. 1996. ‘Influence of word meaning on the acquisition of a reading vocabulary in second-grade children,’ Reading and Writing 8/3: 235–50.

Messer, M. H., Leseman, P. P. M., Boom, J., Mayo, A. Y. 2010. ‘Phonotactic probability effect in nonword recall and its relationship with vocabulary in monolingual and bilingual preschoolers,’ Journal of Experimental Child Psychology 105: 306–23.

Moore, R. K., ten Bosch, L. 2009. ‘Modelling vocabulary growth from birth to young adulthood’. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp. 1727–30.

Nakagawa, S., Johnson, P. C. D., Schielzeth, H. 2017. ‘The coefficient of determination R² and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded,’ Journal of the Royal Society Interface 14/134: 1–11.

Nicoladis, E., Laurent, A. 2020. ‘When knowing only one word for “car” leads to weak application of mutual exclusivity,’ Cognition 196: 104087.

Norbury, C. F. 2005. ‘Barking up the wrong tree? Lexical ambiguity resolution in children with language impairments and autistic spectrum disorders,’ Journal of Experimental Child Psychology 90: 142–71.

Palmer, S. D., MacGregor, L. J., Havelka, J. 2013. ‘Concreteness effects in single-meaning, multi-meaning and newly acquired words,’ Brain Research 1538: 135–50.

Parent, K. 2012. ‘The most frequent English homonyms,’ RELC Journal 43: 69–81.

Parks, R., Ray, J., Bland, S. 2020. ‘Wordsmyth English Dictionary-Thesaurus’, retrieved 28 April 2020, from https://www.wordsmyth.net/.

Piccin, T. B., Waxman, S. R. 2007. ‘Why nouns trump verbs in word learning: New evidence from children and adults in the human simulation paradigm,’ Language Learning and Development 3/4: 295–323.

Powell, M. J. D. 2009. ‘The BOBYQA algorithm for bound constrained optimization without derivatives,’ Technical Report NA2009/06.

R Development Core Team. 2017. R: A Language and Environment for Statistical Computing.

Rodd, J. M., Cai, Z. G., Betts, H. N., Hanby, B., Hutchinson, C., Adler, A. 2016. ‘The impact of recent and long-term experience on access to word meanings: Evidence from large-scale internet-based experiments,’ Journal of Memory and Language 87: 16–37.

Rodd, J. M., Gaskell, G., Marslen-Wilson, W. 2002. ‘Making sense of semantic ambiguity: Semantic competition in lexical access,’ Journal of Memory and Language 46/2: 245–66.

Rowe, M. L., Raudenbush, S. W., Goldin-Meadow, S. 2012. ‘The pace of vocabulary growth helps predict later vocabulary skill,’ Child Development 83: 508–25.

Schuth, E., Köhne, J., Weinert, S. 2017. ‘The influence of academic vocabulary knowledge on school performance,’ Learning and Instruction 49: 157–65.

Sénéchal, M., Pagan, S., Lever, R., Ouellette, G. P. 2008. ‘Relations among the frequency of shared reading and 4-year-old children’s vocabulary, morphological and syntax comprehension, and narrative skills,’ Early Education and Development 19/1: 27–44.

Shaoul, C., Westbury, C. 2010. ‘Exploring lexical co-occurrence space using HiDEx,’ Behavior Research Methods 42: 393–413.

Stadthagen-Gonzalez, H., Davis, C. J. 2006. ‘The Bristol norms for age of acquisition, imageability, and familiarity,’ Behavior Research Methods 38/4: 598–605.

Storkel, H. L. 2009. ‘Developmental differences in the effects of phonological, lexical and semantic variables on word learning by infants,’ Journal of Child Language 36: 291–321.

Tabossi, P., Colombo, L., Job, R. 1987. ‘Accessing lexical ambiguity: Effects of context and dominance,’ Psychological Research 49/2–3: 161–7.

Vaden, K. I., Halpin, H. R., Hickok, G. S. 2009. ‘Irvine Phonotactic Online Dictionary, Version 2.0. [Data file]’. Retrieved 24 August 2020, from http://www.iphod.com.

van Heuven, W. J. B., Mandera, P., Keuleers, E., Brysbaert, M. 2014. ‘SUBTLEX-UK: A new and improved word frequency database for British English,’ Quarterly Journal of Experimental Psychology 67/6: 1176–90.

Waxman, S., Fu, X., Arunachalam, S., Leddon, E., Geraghty, K., Song, H. J. 2013. ‘Are nouns learned before verbs? Infants provide insight into a long-standing debate,’ Child Development Perspectives 7/3: 155–9.

Wechsler, D. 2016. Wechsler Intelligence Scale for Children–Fifth UK Edition. Harcourt Assessment.

Zipf, G. K. 1945. ‘The meaning-frequency relationship of words,’ The Journal of General Psychology 33: 251–6.

Zipke, M. 2011. ‘First graders receive instruction in homonym detection and meaning articulation: The effect of explicit metalinguistic awareness practice on beginning readers,’ Reading Psychology 32: 349–71.

Zipke, M., Ehri, L. C., Cairns, H. S. 2009. ‘Using semantic ambiguity instruction to improve third graders’ metalinguistic awareness and reading comprehension: An experimental study,’ Reading Research Quarterly 44/3: 300–21.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data