Language testing does more harm than good
Richard Smith University of Warwick, UK
Anthony Green University of Bedfordshire, UK
Chair: Graham Hall, ELT Journal Editor
Richard Smith, proposing the motion
I hope to give voice to some of the concerns of teachers, including my students at Warwick from many countries, who tell me there is a great need for this debate. They have shared with me disturbing stories of:
- national, institution-wide and publisher-produced achievement tests increasingly dominating and constraining the teaching of English in schools;
- annual suicides and other manifestations of severe stress in the run-up to, or following failure in, high-stakes school-leaving or university-entrance examinations in which English plays a central role;
- students and others excluded from academic or professional advancement due solely to failure in English examinations irrelevant to their area of study or work;
- increasingly prevalent uses of standardized test results to identify and justify sanctions against ‘non-performing’ English teachers;
- testing agencies seeking to test children’s English at ever earlier ages;
- English test results legitimizing restrictions on travel and migration;
- the mushrooming of English test preparation centres which profit from the poor as much as those better able to purchase test success.
These examples focus specifically on English language testing and on the relatively large-scale, high-stakes achievement and proficiency tests which my students describe as problematic and which, I think, are mainly what comes to mind when we hear the word ‘testing’. Classroom-based language assessment is a different matter, as Green (2014: 172) clarifies.
Only the first example above relates to negative washback on teaching and learning – the area of potential harm which researchers have mostly acknowledged. The other examples concern the psychological and social impact of English language tests (see Taylor (2005: 154) for this distinction). With regard to washback, here are just two further critical points. First, as UK- or US-based test producers increasingly succeed in selling to education authorities worldwide, there is a Trojan Horse effect – the utilitarian goal of ‘proficiency’ comes to predominate at the expense of other, less obviously testable but important educational values, for example, intercultural understanding, language learner autonomy and literary appreciation (cf. Paran and Sercu 2010). Secondly, these global tests – however technologically innovative, scientifically based and ‘adaptive’ they may seem to be (Kerr 2014) – are acting increasingly as a conservative brake on attempts to innovate away from native speaker norms in favour of more flexible, dynamic and localized conceptions of language-in-use.
However, the most disturbing scenarios above concern not ‘washback’ effects but the negative impact which language testing can have ‘on educational systems, and on society more widely’, on ‘career or life chances of individual test takers’ (Taylor ibid.), and, indeed, on individuals’ psychological well-being. It is high time for the profession to acknowledge and seek to resist these and other increasingly invasive effects – indeed, to begin to stand up to the increasing power (Shohamy 2001) of English testing worldwide.
Anthony Green, opposing the motion
Measurement plays a very important part in our way of life. It is difficult to imagine being without such tools as watches, thermometers, scales, rulers and GPS systems. We take these tools for granted in planning and coordinating what we do, carrying out our jobs, preparing our food, travelling around and so on. Far from being a ‘necessary evil’, language testing is another indispensable measurement tool; one that can serve us by helping to make learning more effective. Tests are useful to us as learners because they can reveal gaps between what we can do with a language and what we want to be able to do; they can help us to set goals and gauge our progress; they improve memory, encourage practice and can help us to transform it from unthinking mimicry into genuine learning. Tests are useful to us as teachers because they can provide hard evidence of whether or not our lessons have been effective. They can show us what students have picked up (and what they haven’t) and so help to guide what we do in the classroom.
Making good measurement tools and interpreting the information they give us requires skill. Given that finding out what learners can do with a language is so important, it is deplorable that so little attention is given to this essential theme in teacher training. All too often, testing is relegated to the end of a training course, or sidelined as an optional unit. As a result of this neglect, it is not surprising that teachers generally lack confidence when it comes to designing tests and that so many teacher-made tests fail to provide valuable evidence about student learning. Of course, Richard Smith’s objections are not about the overwhelming majority of testing, which occurs in the classroom, but about the relatively small proportion of testing that falls outside the control of teachers: the large-scale examinations conducted by national and international agencies. Teachers tend to dislike these tests, accusing them of causing such harms as
- encouraging learning by memorisation
- narrowing the content of classes so that only what is tested gets taught
- fostering anxiety and promoting competition and rivalry between learners
It seems undeniable that these effects can sometimes occur where tests are used, but it is less clear whether the tests themselves are to blame. Where international tests are used, not all countries are equally affected by these negative effects and many tests seem to have very little influence on teaching and learning. If a class involves memorisation, is it really the test that is at fault or our unquestioning assumptions about what is involved in efficient test preparation? Where is the proof that endless repetition of dull test-like activities really results in better test scores? If test takers suffer debilitating anxiety, is the test the only source? Or does pressure to succeed from parents, teachers or others also play a part? Tests make easy scapegoats; in reality the negative effects ascribed to testing have complex causes.
Comments from the floor
A range of perspectives were then put forward from the floor, generally referring to large-scale, high-stakes international language tests. Speakers who supported the motion often pointed to the ‘psychological terror’ for students of testing and exams, and noted that the growth of language testing was not isolated from broader educational trends to ‘test, test and test again’. Many of these speakers acknowledged that testing has a place within ELT, but called for caution about when and why we test. Several also spoke of testing as an ‘industry’, and argued that test designers too often divorced themselves from the political consequences of their work. Indeed, arguments in support of the motion reflected on the purpose of ELT and education in general, asking whether education is ‘about a test score, or about growth and development?’
Many speakers, however, identified nuances within the debate, suggesting that the key issue was not language tests per se, but the use to which tests were put; links were made between testing and wider political concerns such as immigration and citizenship. Consequently, the need for teachers themselves to take more responsibility for understanding language tests and testing was also noted, in order for them to ‘join the conversation’ about how we might evaluate our learners as effectively as possible.
Closing statement by Richard Smith
As Spolsky (1995: 1) says, so long as English language testing is confined to ‘helping students learn or to determining the qualifications of individuals seeking employment’ the ends may justify the means, but ‘testing has been exploited also as a method of control and power – as a way to select, to motivate, to punish’ (ibid.). In response to Anthony Green I would say that the harmful effects and ‘abuses’ of testing are first and foremost something for testers, not end-users, to account for and attempt to mitigate. Tests are not designed in the abstract but for particular purposes and markets, and they have predictable, or at least measurable, consequences.
At the same time, the harmful effects of language testing are everybody’s business, even if we cannot see easily how to bring about change, or even engage in dialogue with testing specialists. This sense of inadequacy is, in fact, a further consequence of testing itself – since perhaps the overall most harmful and insidious effect of all standardized testing is to disempower teachers and students, to remove agency from them and make them feel inadequate in the face of a situation which is beyond their control. To resist the encroaching power of tests and not accept them as a ‘necessary evil’ is therefore an increasingly difficult but an increasingly urgent task.
Closing statement by Anthony Green
The motion implies that a test-free world would be a better place, but ridding ourselves of tests would not eliminate the underlying problems. In fact, a world without formal testing is not so very difficult to imagine. For most of us, public examinations are a relatively new development. Not long ago (and often even today), people looking for opportunities such as university study or professional employment needed the right personal connections (or needed to have enough money to pay bribes to the right people). Well-made tests give us a fairer way of distributing rewards than the old ways of nepotism, cronyism, prejudice and corruption. Although language testing is clearly preferable to the alternatives, I would have to agree that many of the tests in widespread use today are not good enough and that test results are often used inappropriately. We can – and must – do better. As professional language educators, we all share a responsibility to improve this situation. We need to get involved with language testing – studying how tests are made, why they are made that way, whose interests they serve and what social purposes they carry out. I believe we can become better language learners and teachers by better harnessing the power of tests in our classrooms and I believe we can improve the quality and value of large-scale tests by taking a more critical interest in how they are made.
References
- Green, A. 2014. Exploring Language Assessment and Testing. Abingdon: Routledge.
- Kerr, P. 2014. A Short Guide to Adaptive Learning in English Language Teaching. the-round.com
- Paran, A. and Sercu, L. (eds). 2010. Testing the Untestable in Language Education. Bristol: Multilingual Matters.
- Shohamy, E. 2001. The Power of Tests: A Critical Perspective on the Uses of Language Tests. Harlow: Longman.
- Spolsky, B. 1995. Measured Words. Oxford: Oxford University Press.
- Taylor, L. 2005. ‘Washback and impact’ (Key Concepts in ELT). ELT Journal 59/2: 154–155.