The Effectiveness of Lexicographic Tools for Optimising Written L1-Texts

Sascha Wolfer, Thomas Bartz, Tassja Weber, Andrea Abel, Christian M. Meyer, Carolin Müller-Spitzer, Angelika Storrer

International Journal of Lexicography, Volume 31, Issue 1, March 2018, Pages 1–28, https://doi.org/10.1093/ijl/ecw038
Abstract
We present an empirical study addressing the question of whether, and to what extent, lexicographic writing aids improve text revision results. German university students were asked to optimise two German texts using (1) no aids at all, (2) highlighted problems, or (3) highlighted problems accompanied by lexicographic resources that could be used to solve the specific problems. We found that participants from the third group corrected the largest number of problems and introduced the fewest semantic distortions during revision. They also reached the highest overall score and were the most efficient (as measured in points per time). The second group, which received highlighted problems only, lay between the two other groups in almost every measure we analysed. We discuss these findings with regard to intelligent writing environments, the effectiveness of writing aids in practical usage situations, and the teaching of dictionary skills.
1. Introduction
For many years, lexicographers and dictionary publishers have claimed that the use of lexicographic resources is beneficial for resolving language-related problems. However, evaluating the efficiency of a lexicographic resource is notoriously difficult and previous efforts have relied on a single dictionary project or used an artificial lookup situation that is only distantly related to the practical language-related problems of users. Solving these issues is essential to the proper assessment of the efficiency of lexicographic tools in practical usage situations such as text reception and production.
The experimental study reported in this paper is our contribution to filling this gap. We focus on text optimisation in an academic context and explore the extent to which lexicographic writing aids improve text revision results. Our experiment is a novel kind of dictionary usage study involving multiple types of lexicographic resources and avoiding an overly artificial lookup situation. The results of our study suggest that the quality of textual revisions can benefit significantly from linking language problems to relevant lexicographic resources.
The remainder of this paper is structured as follows. In Section 2, we relate our study to previous work. In Section 3 we define our guiding research questions and hypotheses. Section 4 provides a description of the study design, the materials and the methods we used. Our main findings are described in Section 5 and discussed in Section 6. In Section 7, we assess the impact of these findings on the goal of linking writing environments with lexicographic resources, and on teaching dictionary skills.
2. Related work
Our work is closely related to previous research efforts from three different research areas: (a) utilising dictionaries to solve language-related problems, (b) analysing the effectiveness of writing aids in practical usage situations, and (c) computer-assisted language learning (CALL) and intelligent writing environments.
2.1. Utilising dictionaries to solve language-related problems
We have analysed the effectiveness of German electronic dictionaries and grammars for optimising texts in an online writing environment. The use of dictionaries in CALL systems is not a new phenomenon, although there is little literature on the topic (Abel 2010).
It has long been claimed that lexicographic resources are useful for text production tasks. Rundell (1999: 50) argues, for instance, that monolingual dictionaries “have the potential of being an invaluable resource at any stage in a productive task”. An important issue is the identification of a relevant dictionary type and dictionary entry for a given language-related problem. This requires an understanding of the current writing context, which has only been cursorily researched. Seretan and Wehrli (2013) survey context-specific dictionary lookup procedures based on NLP methods, such as automatic lemmatisation, part of speech tagging and morphological analysis. We are, however, not aware of any work that considers the type of language-related problem or the surrounding words for identification of a relevant entry. In this paper, we therefore manually looked up relevant dictionary entries and used them as part of the experimental stimuli.
Tono, Satake, and Miura (2014) investigated the effects of using corpus evidence in text revision processes. They presented their participants with a corpus query tool and asked them to use it to revise texts that the participants themselves had written three weeks earlier. The results suggest that some error types (omission, addition) are easier to identify and correct if the participants consult the corpus, but this does not apply to misinformation errors. Apart from methodological differences, this study is similar to ours because it uses a text revision context with accompanying language resources. The main difference from our study, however, is the type of resource provided to the participants (corpus-based vs. dictionary-based resources).
Yoon (2016) carried out a study with a small number of Korean ESL students who were given a writing assignment. They were asked to use different concordancers (such as the COCA corpus concordancer) as well as the Google search engine and various online reference works (e.g., Roget’s Thesaurus, LDOCE). The results show that bilingual dictionaries and the COCA concordancer were used most frequently, and that consulting the resources generally proved helpful during the writing task. The findings suggest that advanced L2 learners in academic settings may benefit from the combined use of different language reference tools. The difference from our study lies in the fact that we do not contrast the effectiveness of concordancers vs. dictionaries, and that we explicitly provide relevant dictionary entries for language-related problems.
2.2. Effectiveness of writing aids in practical usage situations
By explicitly providing the relevant dictionary entries, we reiterate the claim that dictionaries should adapt to people’s language-related needs (Bergenholtz and Tarp 2003: 172), because the original aim of dictionaries is to be a suitable tool in situations in which communicatively or cognitively oriented linguistic questions or difficulties arise (Wiegand et al. 2010: 98–99).
The goal of research into dictionary use is to contribute empirical knowledge on how this adaptation can be achieved. Most studies concentrate on current lexicographic practice and evaluate particular features of dictionaries as well as the design of individual items, or collect information (e.g. via questionnaire studies) on what users like about dictionaries and what is less important to them (for overviews see e.g. Lew 2011, Müller-Spitzer 2014, Töpel 2014, Welker 2010, Welker 2013).
Recently, Chen (2016) showed that the integration of an online dictionary in a CALL system helped participants to improve their productive collocation knowledge. However, participants’ performances “were still not satisfactory” because “participants showed inadequate dictionary use skills” (ibid., p. 1). Without going into detail about the participants’ problems, we want to support Chen’s claim that “teachers should provide instructions to improve learners’ dictionary use skills” (ibid., p. 22). Chi (1998: 565) drew a similar conclusion: “Both lexicographers and publishers have over-estimated the knowledge, ability and the level of persistence students would need in order to teach themselves how to use a dictionary.”
In addition to research into actual dictionary use, it is also important to identify and examine language-related tasks that need to be managed in everyday life and, “as a starting point”, focus on “the language problem rather than the dictionary” (Frankenberg-Garcia 2011: 121, cf. also Pearsall 2013). However, it is difficult to create a setting which is close to an ordinary language task on the one hand and which, on the other hand, allows quantitative evaluation of dictionary usage for this task. Quantitative evaluation means that the collected data has to be comparable across participants and measurement errors have to be as systematic as possible.
In this study, we rise to the challenge with the help of an experimentally controlled study. The setting we created was (at least in part) close to an ordinary working situation: students optimised two texts on their own laptops. However, the artificial elements included the fact that they had to perform the task in a lecture room and were only allowed to use the aids we provided. We concentrated on L1 users, because the claim that “next to nothing is known when it comes to the use that is made of dictionaries by L1 users” (Bogaards 2003: 28, as exceptions see Klosa et al. 2014, Müller-Spitzer et al. 2014) still holds. We hope to contribute a new type of study to research into dictionary use in the way recommended by e.g. Levy and Steel: “The study reported here, with data drawn from a large-scale survey, reports on what students say they do when using electronic dictionaries. This reportage does not necessarily reflect what students actually do […]. Smaller-scale studies are needed to complement and enrich the findings of the present study” (Levy and Steel 2015: 194).
2.3. Computer-assisted language learning and intelligent writing environments
Digital environments and tools to help users produce and optimise texts have been primarily discussed in the context of computer-assisted language learning (CALL) and intelligent computer-assisted language learning (ICALL), in which techniques from natural language processing (NLP) and artificial intelligence (AI) are used to analyse learner language (cf. Abel 2010, Gamper and Knapp 2002, Knapp 2004, Meurers 2013). So far, the main target groups of (I)CALL are second (L2) and foreign (FL) language learners. NLP in intelligent systems is used to analyse both L2 and FL learner languages at different linguistic levels (ICALL), and native language (L1) for L2/FL learners deriving from the need to expose them to authentic L1 language (Authentic Text ICALL, referred to as ATICALL, cf. Meurers 2013). In our study, the target group consists of L1 speakers developing their writing abilities in their native language.
Recently, there has been tremendous progress in the development of ICALL tools aimed at supporting the writing process. The Glosser system (Villalón et al. 2008) displays so-called trigger questions as a scaffolding strategy to initiate a reflection process on the quality of written text. Criterion (Burstein et al. 2003) uses techniques from automated essay evaluation to identify problems in a text and generate basic explanations of the issue as well as suggestions for correction. The Writing Aid Dutch (De Wachter et al. 2014) is similar, but additionally provides text-related statistics, such as readability scores, frequently used words, etc. The Cambridge English Write & Improve (http://sat.ilexir.co.uk) and the LightSide Revision Assistant (http://lightsidelabs.com/ra/) systems are two examples of integrating such research results into fully-fledged systems for assessing and improving English writing skills. Unlike our work, none of these systems presents relevant dictionary entries or excerpts from a grammar reference. The goal of our work is instead to analyse the effectiveness of providing lexicographic resources in comparison to having no revision aid at all or simply highlighting problems in a similar way to Criterion and the Writing Aid Dutch. The majority of existing (I)CALL tools use English as the main language. Little is known about the effectiveness of such tools for languages such as German, on which we focus in this paper.
To date, NLP systems have been useful for the automatic detection of simple language-related problems similar to our “highlighted” condition (see below), but they often fail to provide helpful explanations of why something is wrong, which hinders writers in improving their writing skills. More complex problems, such as the use of inappropriate registers, have, to our knowledge, not yet been addressed by automatic approaches. However, providing meaningful feedback is crucial in the context of language teaching. Research into (manual) text revision underpins this idea. Fix (2004: 307–319) found that complex language problems are usually not identified during text revision. In addition, identified problems often cannot be optimised because writers lack appropriate executive strategies and critical language awareness.
3. Research question and hypotheses
The general research question of our study is: Do lexicographic resources have a positive effect in a text revision task with L1 users, or is it sufficient to mark language problems and rely on the users’ language skills? For our experiment, described in the following section, we identified 35 language problems in two texts. For each of these problems, we chose snippets of authentic lexicographic resources which could, in principle, help to find a more appropriate formulation. Providing lexicographic resources for language problems inevitably means that the problems are explicitly marked as such. We thus created a setting that allows us to compare the effects of lexicographic resources with the effects that can be obtained by simply marking the language problems. Consequently, we prepared three distinct versions of the two texts: (1) a text-only version without additional aids; (2) a version in which all language-related problems were highlighted, but no additional resources were provided; (3) a full version in which all problems were highlighted and linked to the lexicographic resources. We tested the following hypotheses:
H1: Marking problems is helpful for text revision, i.e., the revision results of the two text versions with highlighted language problems (versions 2 and 3) yield a higher quality of revisions than the results of the text-only version (1).
H2: Lexicographic resources have additional positive effects on revision quality, i.e. participants who receive the full version (3) outperform participants using the highlighted problems version (2) and the text-only version (1).
4. Method
4.1. Design
We employed a 2×3 mixed design. The first factor, “text”, included the two levels “Youth” and “Phraseology” (see Section 4.3 for a detailed description, and Appendix 1 for the text excerpts we presented to the participants). We varied the factor “text” within participants, so that each participant saw both texts. The order in which the texts were presented was chosen randomly. The second factor, “revision aid”, had three levels: “only text”, “highlighted” and “full”. This factor was varied between participants, i.e. each participant was randomly assigned to one of the groups and saw both texts in the same revision aid condition.
4.2. Participants
Our participants were undergraduate students of German linguistics at a German university. Participation in the study was a course requirement. Hence, our group of participants was relatively homogeneous in terms of university subject. For this study, we consider this an advantage because it keeps certain participant-related variables constant. In a follow-up study, it would be interesting to see how students from different subjects or a completely different population would behave. We collected responses from 105 participants. In total, 26 participants stated that German was not their native language, which is why we excluded their data from subsequent analyses. One participant was excluded because she or he took less than five minutes to revise both texts. The final number of data sets was 78. According to the self-reports at the end of the study, the vast majority (71 participants, 91.0%) were in their first semester of linguistics, three participants were in the third semester, and one participant was in the eighth semester. In regard to the factor revision aid, the data sets are distributed as follows: 26 participants were in the “only text” revision aid condition, 25 in the “highlighted” condition and 27 in the “full” condition.
According to their self-reports, 17 participants (21.8%) used monolingual dictionaries “at least once per week”, 23 participants (29.5%) “at least once per month”, 24 participants (30.8%) “at least once in half a year” and 14 participants (17.9%) “less frequently or never”. These categories are evenly distributed over the experimental conditions “only text”, “highlighted”, and “full”, as indicated by a χ²-test (χ²(6) = 3.41, p = .76). Hence, none of the effects of experimental condition reported below are attributable to differences in the participants’ dictionary usage experience.
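For transparency, a minimal R sketch of such an independence check is given below; the data frame `participants` and its columns `dict_freq` and `aid` are our assumptions for illustration, not the original analysis script.

```r
# Sketch: is self-reported dictionary usage frequency independent of the
# assigned revision aid condition? (Assumed data frame and column names.)
tab <- table(participants$dict_freq, participants$aid)  # 4 x 3 contingency table
chisq.test(tab)  # df = (4 - 1) * (3 - 1) = 6, as in the reported test
```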
4.3. Material
The procedure for creating and annotating the material of our study is visualised schematically in Figure 1. We presented two text excerpts to the participants. The text excerpt “Youth” was taken from the KoKo Corpus (Abel et al. 2014). It was authored by a 12th-grade high-school student and consists of 260 words. It contains 20 language problems that were identified by the authors of the present paper. The “Phraseology” text excerpt was taken from the introduction of a term paper by a student of German philology at the University of Dortmund. It consists of 204 words and contains 15 language problems. Each of the texts was distributed over two screen pages to fit both the original text and the editing field below the text on one screen (see Figure 2 for an example of one screen page). In terms of age and education, our participants were very similar to the author of the “Phraseology” text. This is not the case for the “Youth” text, since we consider university students to be more advanced than high-school students in terms of education and perhaps also language ability.

Figure 1. Procedure for creating and annotating the material. Two text excerpts with a total of 35 identified language problems were formatted to realise the three revision aid conditions (versions 1 through 3). Two annotators used the model solutions as a reference during the annotation process. The results from the annotation were the basis of the statistical analyses.

Figure 2. Sample stimuli screen from the “full” revision aid condition. The source text with highlighted language problems (only in “highlighted” and “full” conditions) with bold references to the lexicographic resources (only in “full” condition) is in the top left corner. In the bottom left corner is a text editing area (“edit box”) where participants can revise the sample text. On the right, there are the lexicographic resources linked with the bold references in the sample text (only in the “full” condition). The right area was scrollable to access the other resources. The left area of the screen was not scrollable to ensure that both the source text and its revised version were visible at any time.
4.4. Revision aids
For each text, we created three versions: an “only text” version without any highlighted text parts or lexicographic resources, a “highlighted” version in which language problems were highlighted in yellow, and a “full” version with highlighted problems and corresponding entries from lexicographic resources. Figure 2 shows the second page of the “Youth” text in the “full” revision aid condition. To help the participants associate the highlighted language problems with the corresponding lexicographic resources, we introduced a running number printed in bold after each highlighted problem and next to each lexicographic resource. These numbers were the only change we made to the text material itself. Each language problem was paired with one specific lexicographic resource, so each participant in the “full” revision aid condition saw the same lexicographic resource for a specific problem.
4.5. Lexicographic resources
The lexicographic resources taken from the online dictionaries and grammars listed below were introduced by brief descriptions (a) of the corresponding language problem highlighted in the text and (b) of the lexicographic resource displayed (see Figure 2). In order to avoid visual overload due to the different layouts of the original resources or advertisements, we reduced the dictionary entries, information panels, and grammar sections to their essential structure, preserving the familiar formatting and colour schemes. In some cases, we also reduced the information provided by the resources, depending on the corresponding language problem. This includes a priori word sense disambiguation and the removal of information related to irrelevant word senses.
The lexicographic entries were taken from the following online dictionaries and grammars:
Duden (“Duden Universal German Dictionary”, http://www.duden.de; the sections “Bedeutungen, Beispiele und Wendungen”, engl. sense-related items, examples and collocations, “Gebrauch”, engl. usage notes, and “Synonyme”, engl. synonyms, were used)
dwds.de (“Digitales Wörterbuch der deutschen Sprache”, http://www.dwds.de; the components “DWDS-Wortprofil 3.0”, an overview of significant collocations of a queried word, and “GermaNet”, an overview of semantically related words, e.g. hyponyms/hypernyms, were used)
E-VALBU (“Elektronisches Valenzwörterbuch deutscher Verben”, http://hypermedia.ids-mannheim.de/evalbu, a verb valency dictionary)
canoo.net (http://www.canoo.net; the components “Wortgrammatik”, engl. word grammar, and “Satzgrammatik”, engl. sentence grammar, were used)
grammis 2.0 (http://hypermedia.ids-mannheim.de/, an online grammar based on the “Grammatik der deutschen Sprache”, Zifonun et al. 1997)
Table 1 relates the types of language problems highlighted in the text excerpts (cf. Appendix 1) to the lexicographic resources used in our study and to the relevant information types chosen from these resources. The types of language problems selected for the study concern different linguistic levels and features that are crucial with regard to text quality, and, consequently, to the skills necessary to produce a text that meets the respective quality criteria (Becker-Mrotzek and Böttcher 2006, Nussbaumer and Sieber 1994).
Table 1. Lexicographic resources and information used in the study, classified by type of language problem highlighted in the text excerpts of the study.

| Problem type | Index in “Youth” | Index in “Phraseology” | Lexicographic resource | Information |
|---|---|---|---|---|
| Syntax | 5, 7, 15, 16, 17 | — | canoo.net | sentence grammar, phrase structures, mood/tense |
| Syntax | 2, 12, 18 | — | E-VALBU | sentence structures |
| Syntax | 1 | — | dwds.de/DWDS-Wortprofil 3.0 | complex phrases |
| Lexis | 4, 6, 8, 10, 11, 13, 14, 19, 20 | 3, 7, 14 | Duden | sense-related items, examples, collocations, usage notes, synonyms |
| Lexis | 3 | 1, 4, 9, 10, 13, 15 | dwds.de/DWDS-Wortprofil 3.0 | collocations, complex phrases |
| Lexis | — | 8 | dwds.de/GermaNet | semantically related words |
| Text structure | 9 | 2, 5, 6, 11, 12 | canoo.net, grammis 2.0 | text grammar, anaphora/deixis |
With the help of the lexicographic resources, participants may become aware of the reasons why particular formulations are highlighted as “problems”. For example, participants might not be aware that the word “Bub”, a standard German variant of “Junge” (“boy”), is only common in southern Germany, Austria, Switzerland, and South Tyrol (cf. Ammon et al. 2004). Readers from other parts of the German-speaking area may mistake the regional term “Bub” for the similarly written word “Bube”, which means “knave” or “jack”. The lexical choice may thus lead to comprehension difficulties, which is why the more appropriate term in this context would be “Junge” (“boy”). Diasystematic information about the national, regional, archaic, etc. usage of an expression is provided by dictionaries like the Duden Universal German Dictionary used in the study. For the example term “Bub”, the lexicographic resource lists diasystematic information as well as alternative (standard) terms that users can take into account when revising the text (see Figure 2).
4.6. Procedure
4.6.1. Data collection
We used one lecture slot for data collection. We distributed the participants randomly between two large lecture rooms. When the students entered the room, they were randomly assigned to one of the three revision aid conditions and received the corresponding URL of the study system. The three conditions (only text, highlighted, full) were implemented as three different projects with different URLs. The participants were seated with enough space (at least two seats) between them and were asked to complete the study on their own computers. We asked the participants to work quietly on the task and not to interact with each other during the experiment. At least three supervisors were present in both rooms at all times. We used proprietary web-based questionnaire software for data collection. Opening other browser windows or using other devices was not allowed.
The participants were first presented with detailed instructions on how to proceed with the experiment. They were asked to imagine a scenario in which a fellow student gives them a text with the request to carefully read it, identify language problems, and correct them. We told them to preserve the meaning of the whole text and the segments they revise. In the “highlighted” and the “full” conditions, we additionally told the participants that they were not required to find a better version of a highlighted text segment at all costs, but that they should aim for the linguistically best revision of each text.
After completing the study, the participants were asked to remain seated until all other participants were finished.
4.6.2. Annotation of revisions
The aim of our study was to assess the effectiveness and efficiency of lexicographic tools for text optimisation. To address this research question, we needed data on which of the participants’ changes improved the original and which of them agreed with the recommendations of the corresponding lexicographic resource. We asked two human raters to manually annotate the success of the revisions proposed in the 78 data sets. We compiled detailed annotation guidelines explaining the annotation categories. Another document contained multiple examples for each language problem illustrating difficult cases (“model solutions” in Figure 1). For each language problem in each data set (35 problems × 78 participants = 2,730 instances of revisions), the raters annotated the following dimensions:
Change/Modification: Was the text element with a language problem modified by the participant? Possible values were “yes” and “no”.
Improvement/Success: Was the problem improved (i.e., repaired), did the text section remain inaccurate, or was the revised version even worse? Whenever a participant did not attempt to solve the problem at all, it was still annotated as inaccurate. We also annotated whether revisions led to semantic distortion, i.e. whether the participants changed the semantic content of the text segment. Possible values were “improved/resolved”, “no change”, “deteriorated” and “semantic content distorted”. In the statistical analyses (see Section 5), we collapsed this dimension into the binary variables “improved or not” and “semantic content distorted or not”.
After both raters had finished the annotations, we identified all cases in which they disagreed. Mean Cohen’s kappa for the initial annotation run (i.e. before any critical cases were discussed) was κ = 0.819, with κ = 0.900 for Change/Modification and κ = 0.740 for Improvement/Success. Carletta (1996: 252), referring to Krippendorff (1980), describes kappa scores above 0.8 as “good reliability”.
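Agreement figures of this kind can be computed with standard tooling; the following R sketch uses the irr package and an assumed data frame `ratings` with one column per rater (illustrative, not the authors’ original script).

```r
# Sketch: unweighted Cohen's kappa for two raters on the same 2,730 items.
# The data frame 'ratings' with columns 'rater1' and 'rater2' is assumed.
library(irr)

kappa2(ratings[, c("rater1", "rater2")])  # returns kappa with z and p-value
```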
All divergent cases were discussed between the annotators with the help of an additional adjudicator. In the last step, the annotators agreed on a single annotation in all cases. In five cases (0.18% of all cases), the annotators were still unsure about how to rate the improvement of edited text passages. These cases have been resolved by the authors of this paper.
5. Results
In the following three sections (5.1 through 5.3), we first analyse the annotated variables (changes, improvements, semantic distortions). In Sections 5.4 and 5.5, we investigate the performance of the participants in the different conditions by means of participant-based scoring measures. Note that we only analyse changes that are associated with the 35 language problems we identified beforehand. Changes to other text parts were not taken into account.
5.1. Changes
As can be seen in Table 2, participants were more likely to revise language problems in the “Youth” text than in the “Phraseology” text (marginal means: 0.74 vs. 0.58). We also note clear differences between the revision aid conditions: when participants were only provided with the raw text, revisions were least probable (0.36); conversely, revisions were most probable when participants were provided with highlights and lexicographic resources (full revision aid, 0.89). The condition with highlighted problems but without lexicographic resources lies between these two (0.75). The overall probability of changing a text element was 0.67 (67%).
Table 2. Mean probabilities (and standard errors, SE, in parentheses) of changing the text elements containing language problems. If multiplied by 100, figures can be read as percentages.

| Revision aid | “Youth” | “Phraseology” | Marginal means |
|---|---|---|---|
| Only text | 0.48 (0.02) | 0.21 (0.02) | 0.36 (0.02) |
| Highlighted | 0.84 (0.02) | 0.65 (0.02) | 0.75 (0.01) |
| Full | 0.90 (0.01) | 0.88 (0.02) | 0.89 (0.01) |
| Marginal means | 0.74 (0.01) | 0.58 (0.01) | 0.67 (0.009) |
To test the raw differences for statistical significance, we fitted a logistic mixed-effects model using the lme4 package (Bates et al. 2015) within the statistical computing environment R (R Core Team 2015). In a mixed-effects model, it is possible to incorporate random effects into a regression equation, controlling for individual effects of participants and stimulus items. In a logistic mixed-effects model, a link function from the binomial family is chosen to fit the statistical model to the data. Jaeger (2008) shows that this yields more precise results when dealing with binary data. When controlling for individual effects of participants and stimulus items by including random intercepts for both in the analysis, the regression model indicates significant differences between the two texts (Phraseology vs. Youth, β = -1.72, SE = 0.38, z = -4.49, p < .0001) and between all revision aid conditions (highlighted vs. only text, β = 2.50, SE = 0.47, z = 5.35, p < .0001; full vs. only text, β = 3.48, SE = 0.48, z = 7.21, p < .0001; full vs. highlighted, β = 0.98, SE = 0.49, z = 1.99, p = .047). The significant interaction between the factors text and revision aid (β = 1.43, SE = 0.33, z = 4.40, p < .0001) indicates that there is a difference between the texts in the “only text” and “highlighted” conditions, but not in the “full” condition. Figure 3 visualises the relationship between the two factors and the probability of modifying a language problem.

Figure 3. Mean probabilities of revising language problems. Error bars indicate one standard error. Figures at the bottom of the bars indicate the number of language problems contributing to the respective bar.
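A minimal sketch of such a model specification in lme4 is shown below; the data frame `revisions` and its column names are assumptions for illustration, not the authors’ original script.

```r
# Sketch: logistic mixed-effects model for the binary outcome 'changed',
# with fixed effects for text and revision aid (plus their interaction)
# and random intercepts for participants and stimulus items.
library(lme4)

m_change <- glmer(
  changed ~ text * aid + (1 | participant) + (1 | item),
  data   = revisions,   # one row per problem x participant (assumed)
  family = binomial     # logistic link for the binary outcome
)
summary(m_change)       # fixed-effect estimates, SEs, z- and p-values
```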
5.2. Improvements
To analyse the probability of improving language problems, we selected only cases that were changed by the participants. Excluding cases without changes reduces the number of cases in the data set from 2,730 to 1,838 observations. Table 3 provides an overview of the raw probabilities.
Table 3. Mean probabilities (and standard errors in parentheses) of improving the text elements containing language problems.

| Revision aid | “Youth” | “Phraseology” | Marginal means |
|---|---|---|---|
| Only text | 0.59 (0.03) | 0.57 (0.05) | 0.59 (0.03) |
| Highlighted | 0.67 (0.02) | 0.60 (0.03) | 0.64 (0.02) |
| Full | 0.77 (0.02) | 0.74 (0.02) | 0.76 (0.01) |
| Marginal means | 0.69 (0.01) | 0.67 (0.02) | 0.68 (0.01) |
We observe effects similar to those for the changes, although the differences are smaller. The significance test was the same as in the previous section. We excluded the interaction from the model because it did not significantly improve model fit and we aimed for the most parsimonious model. The difference between the two texts is also not significant (p > .1). In contrast, the differences between the revision aid conditions are still highly significant (full vs. only text, β = 1.06, SE = 0.17, z = 6.11, p < .0001; full vs. highlighted, β = 0.74, SE = 0.15, z = 5.01, p < .0001), whereas the difference between the highlighted and only text conditions is only marginally significant (β = 0.32, SE = 0.17, z = 1.87, p = .061).
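Whether an interaction term earns its keep can be checked with a likelihood-ratio test between nested models; the following R sketch illustrates this procedure under the same assumed variable names as above.

```r
# Sketch: compare the improvement model with and without the text:aid
# interaction via a likelihood-ratio test (variable names assumed).
library(lme4)

m_full    <- glmer(improved ~ text * aid + (1 | participant) + (1 | item),
                   data = changed_only, family = binomial)
m_reduced <- update(m_full, . ~ . - text:aid)  # drop the interaction term
anova(m_reduced, m_full)                       # chi-squared LRT on 2 df
```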
Figure 4 visualises the influence of the two factors on the probability of improving a language problem. Note that the number of data points (bottom of the bars) is lower than in Figure 3 because we only considered cases in which participants actually changed the respective text parts. Figure 4 suggests that it makes sense to exclude the interaction effect because the differences between the two texts (dark vs. light bars) are overall quite similar.

Figure 4. Mean probabilities of improving language problems. Error bars indicate one standard error. Figures at the bottom of the bars indicate the number of language problems contributing to the respective bar.
5.3. Semantic distortions
As described in Section 4.6.2, we also annotated whether changes to language problems led to a change in the semantic content of the revised text segments. This can be considered a special case of revision failure. In our dataset, this happened in a considerable number of cases (329 of 1,838, roughly 18% of the modified language problems). We therefore decided to analyse semantic distortions separately. If lexicographic resources are truly beneficial to text revision, then participants should not only detect and resolve more problems, they should also introduce fewer semantic distortions than in the other versions. Again, we considered only those cases in which participants actually changed the text segment containing a language problem.
Table 4 summarises the results for the two factors. Again, the test for significance shows no significant interaction between text and revision aid condition. The difference between the “Youth” and “Phraseology” texts also did not prove significant (p > .1). This is surprising given the raw values indicated by the marginal means at the bottom of Table 4 (0.21 vs. 0.13); the non-significance is most likely due to inter-individual noise between participants and/or items. In contrast, all differences between the revision aid conditions are significant (highlighted vs. only text, β = -0.57, SE = 0.21, z = -2.76, p = .006; full vs. only text, β = -1.30, SE = 0.21, z = -6.22, p < .0001; full vs. highlighted, β = -0.73, SE = 0.18, z = -4.04, p < .0001).
Table 4. Mean probabilities (and standard errors in parentheses) of introducing semantic distortions by revising language problems.

| Revision aid | “Youth” | “Phraseology” | Marginal means |
|---|---|---|---|
| Only text | 0.30 (0.03) | 0.23 (0.05) | 0.28 (0.02) |
| Highlighted | 0.21 (0.02) | 0.17 (0.02) | 0.20 (0.02) |
| Full | 0.15 (0.02) | 0.09 (0.02) | 0.13 (0.01) |
| Marginal means | 0.21 (0.01) | 0.13 (0.01) | 0.18 (0.009) |
The figures of Table 4 are plotted in Figure 5 to give an impression of the relationship between the two factors and the probability of introducing a semantic distortion.

Figure 5. Mean probabilities of introducing semantic distortions. Error bars indicate one standard error. Figures at the bottom of the bars indicate the number of language problems contributing to the respective bar.
5.4. Participant perspective
In all previously presented analyses, one revision was treated as one case. We now adopt a perspective on the data that focuses on the participants. As a first measure, we propose counting the changed problems per participant. Given the total of 35 language problems, the maximum value for one participant is also 35. We then average over the participants for each condition (our independent variable). If we do the same for the improvement annotations, we can visualise the average number of changes and improvements (our dependent variables) in the three revision aid conditions in a single chart (see Figure 6). The participants revised 12.9 (SE = 1.25) problems on average in the “only text” condition, 26.4 (SE = 1.26) problems in the “highlighted” condition and 31.2 (SE = 1.40) problems in the “full” condition.

Figure 6. Mean numbers of changed (lighter colours) and improved (darker colours) problems per participant. Error bars indicate one standard error. Figures at the bottom of the bars indicate the number of participants contributing to the respective bar.
The numbers of improvements are of course lower, because a participant can only improve a language problem if they actually change it. For improvements, the order of the three conditions is the same as for changes: in the “only text” condition, 7.58 (SE = 0.75) problems were improved on average. This figure more than doubles in the “highlighted” condition (17.0 problems, SE = 1.02) and increases further in the “full” condition (23.6 problems, SE = 1.11). All these differences are significant (p ≤ .012), as indicated by pairwise t-tests¹ for the dependent variables change and improvement.
To further analyse the performance of each participant, we calculated a score by subtracting the number of inappropriate revisions (revisions that make a language problem worse or distort its meaning) from the number of improving revisions. A single participant can thus achieve a maximum score of 35, but only if they changed and improved every single language problem. The minimum score is -35, which would indicate that a participant changed every language problem but made each one worse. A score of zero can represent several behavioural patterns: a participant achieves a score of zero if they did not attempt a single revision, but also if, for example, they attempted twelve revisions of which five improved the text, two left its quality unchanged, and five made the problem worse. Our rationale behind this measure is to make participants (and especially groups of participants) comparable while, at the same time, considering all the revisions they made in the texts. In contrast to the previous analyses, it also penalises participants who attempted to change problems but made the text worse.
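As a sketch, this score can be computed per participant along the following lines; the data frame `revisions` and its outcome coding are assumptions for illustration.

```r
# Sketch: per-participant score = improvements minus revisions that made
# the problem worse or distorted its meaning (assumed column names).
library(dplyr)

scores <- revisions %>%
  group_by(participant, aid) %>%
  summarise(
    score = sum(outcome == "improved") -
            sum(outcome %in% c("deteriorated", "distorted")),
    .groups = "drop"
  )
```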
Participants who received the texts without any revision aid scored only 3.62 (SE = 0.79) on average, whereas participants in the “highlighted” group achieved a mean score of 10.4 (SE = 1.13). Participants in the “full” condition achieved a mean score of 18.6 (SE = 1.06) and thus performed significantly better than both other groups. The “only text” condition was the only condition in which participants scored less than zero points (one participant scored -4 points, two participants scored -3 points). Figure 7 visualises the participants’ scores. The distributions of scores in the three conditions overlap and there are a few outliers. However, there are clear differences in the central tendencies of the three groups; the difference between the “only text” and the “full” condition is especially apparent.

Figure 7. Scores for all 78 participants in the three revision aid conditions. Each participant is represented by one point. When two or more participants in one group score identically, the points are aligned horizontally next to each other. For example, five participants scored 21 points in the “full” condition.
We took a closer look at the outlying participant scoring zero points in the “full” condition. This participant actually did not attempt to change a single language problem. Given that she or he also was one of the fastest participants to complete the experiment, we have to assume that this participant was not working on the texts at all, but only waited for the experimental session to end.
5.5. Measuring efficiency
The previous analyses suggest that participants in the “full” revision aid condition changed and improved more problems. Moreover, when penalising participants for inappropriate revisions, the “full” revision aid group still outperformed the other groups.
We now devise an even stricter criterion. Lexicographic resources that accompany the texts certainly provide much more information than the plain text or a text with highlighted segments. However, those resources also impose a greater workload on participants: the information presented in the resources has to be processed and transformed into revisions. The question remains whether the participants who were given lexicographic resources worked efficiently. We consider this a relevant question because only if working with dictionaries is really efficient can they be regarded as useful tools for daily practice. To answer this question, we extracted the time each participant spent revising the texts from the data sets provided by the questionnaire software (each screen page is timed separately by the software). Note that we only used the duration that the participants took to revise the texts. We excluded the time spent on all other pages of the questionnaire (e.g., the instructions) and the time the participants spent waiting for the other participants to finish the study.
Indeed, participants in the “full” condition worked on the texts the longest (mean duration: 31.6 minutes, SE = 0.32), participants in the “highlighted” condition took 26.9 minutes on average (SE = 0.41), and participants in the “only text” group were the fastest, with a mean duration of 24.8 minutes (SE = 0.35). To measure the efficiency of the participants, we can divide the previously introduced score by the time each participant took to revise the texts. This procedure yields a measure of “points per minute”.
Figure 8 visualises the average points per minute for the three groups. Participants in the “only text” condition scored 0.19 points per minute (SE = 0.03), participants in the “highlighted” group scored more than twice as many points per minute (mean: 0.46, SE = 0.06), and participants in the “full” condition scored an average of 0.62 (SE = 0.05) points per minute. All these differences are significant (p ≤ .028), as indicated by pairwise t-tests with the Holm correction applied.
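A sketch of this efficiency measure and the Holm-corrected comparisons follows, continuing the assumed `scores` frame from above; the per-participant timing frame `minutes` is likewise an assumption.

```r
# Sketch: points per minute and Holm-corrected pairwise group comparisons.
# 'minutes' holds the assumed per-participant revision time in minutes.
scores <- merge(scores, minutes, by = "participant")
scores$ppm <- scores$score / scores$revision_minutes

pairwise.t.test(scores$ppm, scores$aid,
                p.adjust.method = "holm")  # the default correction in R
```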

Figure 8. Average points per minute for the three revision aid conditions. Points per minute are calculated by dividing the score for each participant by the time the same participant took to revise the texts. Error bars indicate standard errors.
6. Discussion
Given the converging evidence from the analyses of several variables, we can assume a hierarchy with regard to the participants’ performance that places the “only text” condition at the bottom and the “full” revision aid condition at the top. The “highlighted” condition lies in between and, in most cases, differs significantly from both other conditions. This hierarchy holds for all analysed variables. Language problems were changed and improved most often when lexicographic resources were provided. Revisions in this group also contained the fewest semantic distortions. The participant perspective yielded similar results: the participants who saw both the highlights and the lexicographic resources changed and improved the most problems. Moreover, they scored highest and achieved the most points per minute. These results strongly corroborate the hypothesis that lexicographic resources actually aid the revision of texts. The statistical comparisons with the version in which only highlights were presented also show that highlighted text segments alone do indeed improve revision performance, but not to the same extent as additional lexicographic resources.
However, it also has to be noted that, although the results improve considerably, the participants did not perform perfectly when provided with lexicographic resources. Even though we maximised the helpfulness of the resources by handpicking the relevant information for each language problem, the participants still had to understand that information and put it to good use when revising a text. We also cannot be sure that all participants in our experiment actually looked at the provided lexicographic resources. It could be that some participants (like the one who did not attempt a single revision) simply ignored the resources and revised the texts using their own intuition – or did not revise them at all.
In the “full” condition, there were several participants who obviously referred to the resource but did not improve the language problem. This was true in 73 cases. In 24 of them, the participants changed something but neither improved nor worsened the problem. In the remaining 49 cases, the participants used information from the lexicographic resources but made the language problem worse or distorted the meaning of the text. Investigating these cases reveals a typical pattern: some participants were obviously not sensitive enough to the textual context of a problem. For example, when the lexicographic resources provided synonyms or typical collocations, some participants chose a synonym or collocation that was inappropriate in the specific context. In some cases, the additional explanations provided by the lexicographic resources were obviously not read or not properly understood.
This suggests that the presence of lexicographic resources alone does not automatically lead to better revision results. The writers who use them also need to be competent enough to extract the useful information and incorporate it into their revisions. All that being said, these cases are still the minority: of the revisions in which participants used information found in the accompanying lexicographic resources, the overwhelming majority (506 revisions, 87.4%) were indeed successful.
7. Conclusions and outlook
We have presented a novel kind of dictionary usage study that allows us to analyse the effectiveness of using lexicographic tools for text optimisation. We analysed the revision results of the participants when the language problems were highlighted, when a relevant lexicographic resource was associated with each problem and when the students did not receive any revision aid at all. Our results suggest that (1) the detection of language problems plays a crucial role in making a successful revision of a text, that (2) providing lexicographic resources yields significantly better (i.e. more accurate and more efficient) optimisation results than only highlighting problems or providing no aid at all, and that (3) even hand-picking the lexicographic resources does not ensure that users resolve all language problems.
Though lexicographic resources help with the optimisation of a text, writers will often refrain from using them if they are not aware of a language problem. Therefore, we suggest a study into ways to assist writers in the detection of language problems even before the actual lookup process in a lexicographic resource begins. Automatic methods from NLP research can provide a key technology for this, if they manage to detect complex types of language problems, such as the ones considered in our study.
Based on our finding that the revision results were significantly better and more efficient when lexicographic resources were available, we propose researching automatic methods and novel access paths to lexicographic resources in order to assist writers in linking the particular language problem to the corresponding lexicographic information. While similar environments have been proposed for text comprehension tasks (cf. Seretan and Wehrli 2013 for an overview), our setting requires methods that adapt to the changing context whilst a user is writing. To ensure the success of lexicographic resources in the future, they have to be integrated in such environments. Or as Lew puts it:
“While it is fairly uncontroversial that people will continue to have lexical needs in natural communication as well as in more or less artificial learning contexts, it is much less certain that dictionaries will persist for much longer, at least in the form we know them today. Rather, it seems likely that dictionaries will increasingly become absorbed into more general digital tools designed to provide assistance with communication, expression, and information searching” (Lew 2015: 7).
In our experiment, we manually selected lexicographic resources that contain information which could help resolve a language problem. For practical applications, this poses two major challenges: selecting a suitable resource for a given problem and selecting the relevant parts of an article. The latter is particularly important, since we found that our participants were sometimes not careful enough in their interpretation of a lexicographic entry. Future research should also investigate whether there are suitable lexicographic resources for the language problems in focus. Also, when it comes to integrating lexicographic resources into intelligent writing environments, a requirement formulated by De Schryver (2009) comes to mind:
“What is needed is a dictionary […] that is truly adaptive – meaning that it will physically take on different forms in different situations; and one that would do so as intelligently as possible – meaning that it would have the ability to study and understand its user, and based on that to learn how to best present itself to that user” (De Schryver 2009: 586).
We envision an intelligent writing environment that assists users in writing and optimising texts using automatically generated hints to possible problems and corresponding lexicographic resources for learning how to resolve a problem. We argue in favour of assisting users with their writing rather than offering fully automatic corrections because fully automatic NLP systems still yield multiple errors that potentially remain undetected by users if they do not have an appropriate explanation or evidence. Instead, we consider it an important skill to be able to effectively use lexicographic resources. Existing systems neither integrate lexicographic resources on a large scale nor do they support intelligent access paths to the lexicographic entries in a similar way as simulated in our experiment. Based on our findings, we consider this an important strand of future research.
Apart from the topic of intelligent writing environments, a general result of our study is that users profit from adequately chosen lexicographic resources when optimising texts. On the one hand, people have language-related needs; on the other hand, there are good lexicographic resources that help solve these language-related problems. However, as all language teachers probably know, bringing these two sides together is a challenging task. As a consequence, our study provides good arguments that improving dictionary skills is also an important future concern. “Human dictionary use involves two parties: the dictionary and the user. Therefore, successful lexicographic consultation is a two-way affair, and depends on two ingredients: how easy to use the dictionary is, and what skills related to dictionary use the user possesses” (Lew 2013: 16). The most challenging task in this regard is to find the appropriate context for teaching dictionary skills. One promising idea is to integrate dictionary skills in online learning courses (cf. Ranalli 2013) and to embed them in the curriculum (Lew 2013: 29). An appropriately instructed user could then make maximum use of a writing environment with intelligently integrated lexicographic resources.
Note
1. We cannot use mixed-effects models here because each participant corresponds to only one row in the dataset. Pairwise t-tests compare each group with every other group; hence, p-values have to be corrected for multiple comparisons. We employed a Holm correction, which is the default procedure in the R implementation of pairwise t-tests.
Appendix 1: Texts presented to the participants
A1.1. Youth
The following text excerpt “Youth” was taken from the KoKo Corpus (Abel, Glaznieks, Nicolas, and Stemle, 2014). Language problems are underlined and consecutively numbered.
Der deutsche Schriftsteller und Essayist Hans Magnus Enzensberger sagte in einem Interview, dass die Jugend keine beneidenswerte Phase des Lebens sei. Doch ist das Erwachsenwerden nicht geradezu die wichtigste Phase eines Menschen (1)? Hierzu ein paar Punkte.
Ein sehr gutes Argument wie ich finde ist, dass man sich (2) in dieser sogenannten „Phase“ sein eigenes „Ich“ besser kennen lernt, denn man erlebt viel (3) egal (4) ob mit Freunden oder mit der Familie, ob schlechte oder positive Erlebnisse. Ein junger Mensch ist zwar labil, unsicher und macht jede Menge Dummheiten, trotzdem lernt derjenige (5) aus seinen eigenen (6) Fehlern und kann es (7) in der Zukunft besser machen (8).
Ich bin mir sogar (9) sicher, dass man die Phase auch Pubertät nennen kann.
Das Verhalten der Jugendlichen ist in dieser Zeit besonders impulsiv. Manche Jugendlichen (10) schlagen über die Stränge andere wiederum nicht. Jeder Mensch reagiert anders in diesem Moment (11). Zum Beispiel will der Großteil der Mädchen immer gut aussehen, die Trends der Mode nachgehen (12) und mit seinen Freundinnen über die aktuellsten Themen reden. Die Buben (13) wiederum wollen in den Diskotheken feiern bis der Arzt kommt (14). Die Aussage von Hans Magnus Enzensberger, man muss (15) froh sein, wenn man das überstanden hat (16), trifft bei solchen Jugendlichen sicherlich nicht zu.
Ich persönlich finde das Zitat vom (17) deutschen Schriftsteller und Essayisten Hans Magnus Enzensberger nicht für richtig (18). Die Pubertät bzw. die Entwicklungsphase ist eine sehr beneidenswerte. Man lernt in dieser Zeit soviel (19) Interessantes obwohl man keine Souveränität (20) besitzt.
A1.2. Phraseology
The following text excerpt “Phraseology” was taken from the introduction of a term paper by a student of German philology at the University of Dortmund. Language problems are underlined and consecutively numbered.
In der vorliegenden Arbeit beschäftige ich mich mit dem Thema „Exemplarische Analysen von Phraseologismen in der Anzeigenwerbung“ unter Berücksichtigung der Text-Bild-Beziehung. Die konkrete Fragestellung beläuft sich (1) auf die Art und Weise, wie Phraseologismen aufgrund ihrer besonderen Merkmale genutzt werden können, um in Bezug auf (2) Anzeigenwerbungen (3) einen besonderen Beitrag hinsichtlich (4) einer (5) Text-Bild-Beziehung zu leisten. Es soll näher untersucht werden, inwiefern die Textelemente des (6) verwendeten Phraseologismus in der Anzeigenwerbung die visuellen Elemente aufgreifen oder vice versa (7), um die Anzeigenwerbung (8) mittels neuer Konnotationen zu erweitern bzw. Mehrdeutigkeiten hervorzubringen.
Um eine Klassifikation von Phraseologismen zu definieren (9), ist es meiner Meinung nach zunächst notwendig, einen kurzen Einblick über (10) den Forschungsbereich zu geben. Phraseologismen unterscheiden sich durch bestimmte Merkmale von freien Wortverbindungen. Zwecks (11) dieser Merkmale können die Mitglieder einer Sprachgemeinschaft Phraseologismen als solche erkennen. In der Literatur über dieses (12) Forschungsgebiet werden diese festen Wortschatzeinheiten u.a. als ‚Redewendungen‘, ‚Phraseme‘, ‚Idiome‘, ‚Mehrwortlexeme‘ u.dgl.m. bezeichnet. Im Rahmen dieser Arbeit werde ich mich auf die Termini ‚Phraseologismen‘, ‚Mehrwortlexeme‘, und ‚feste Wortverbindungen‘ beschränken, denen ich Synonymie zuspreche (13), um eventuell kontrahierende (14) Definitionen auszuschließen und zugleich ein besseres Leseverständnis zu erzeugen (15).