Contemporary Second Language Assessment

In their introduction to Contemporary Second Language Assessment, Jayanti Banerjee and Dina Tsagari suggest that this book is intended both as a ‘one volume reference’ and as a ‘primary source of enrichment material’ for a wide readership including, of particular relevance to readers of ELT Journal, language teachers and teacher trainers. Available in hardback and ebook editions and priced at almost £130, this would be a substantial investment for individual teachers or students. Because it enters an already rather crowded field of collections on language assessment issues, the book needs to work hard to justify its claims against a number of other impressive titles published within the past five years. These range from the monumental, multi-volume Companion to Language Assessment (Kunnan 2013) to the Sage/ILTA award-winning Routledge Handbook of Language Testing (Fulcher and Davidson 2013) and the more introductory Cambridge Guide to Second Language Assessment (Coombe, Davidson, O’Sullivan, and Stoynoff 2012), not to mention another volume also edited by Tsagari and Banerjee (2016), the Handbook of Second Language Assessment. In practice, unlike the other titles listed above, the book does not really set out to provide a one-stop-shop introduction or a comprehensive overview of the field. It does not have the exhaustive coverage of the Wiley Companion and spurns the relatively gentle, generalist tone of the Cambridge Guide. Instead, as the editors explain in their introduction, this is a collection of research case studies that address perennial questions in a variety of settings, illustrating a range of methods and approaches. However, with just 15 chapters spread over 300 pages, the authors do have plenty of scope to report their research in some depth and to provide rather more background than might be expected of an article in an academic journal.
In practice, some of the chapters are more accessible than others, providing more background for the non-specialist reader. Weigle and Goodwin (Chapter 10), for example, include a helpful overview of the role of corpora in language assessment before presenting their research study. Others make fewer concessions to the uninitiated, with some rather dense presentations of statistical results. While the introduction by the editors provides informative summaries of the individual chapters, I would strongly recommend the reader approach this volume from the end. The excellent final chapter by Sauli Takala, Gudrun Erickson, Neus Figueras, and Jan-Eric Gustafsson deftly lays out the issues picked up by the other authors, putting them into historical perspective. It is a missed opportunity on the part of the editors that their insightful discussions of matters such as test constructs, impact, the Common European Framework of Reference for Languages, and standard setting are not cross-referenced to the earlier chapters, which expand on these themes. Takala and his co-authors depict the contexts for and purposes of assessment as a series of concentric circles. At the centre is the individual language learner, engaged in self-assessment, with successive layers representing the teacher and classroom; school, district and state; national educational system; and international comparisons. With this range of purposes in mind, the focus of the book seems rather narrower than the title might suggest. In spite of growing research interest in assessment in the classroom, the book concentrates only on the outer circles of Takala et al.’s diagram: on the world of large-scale testing. Testing at the interface between secondary schooling and university is a particular preoccupation. 
Most of the chapters involve state, national or international tests; two discuss tests used by individual universities; none concentrates on assessment by teachers in the classroom or self-assessment by learners. As in their De Gruyter Mouton Handbook, the editors choose to organize this book around three broad themes. The first part, titled ‘Theoretical Considerations’, looks at questions of test design from the testing agencies’ perspective. It considers how these agencies analyse, justify and communicate the qualities of their tests. The second, ‘Specific Language Aspects’, presents case studies in the testing of the traditional four skills as well as the novel area of second-language pragmatics. The final part, ‘Issues in Second Language Assessment’, turns towards questions of fairness as well as ongoing and future developments, notably the growing role of information technology in language testing. The opening chapter by Lin Gu shows how language-testing researchers look for patterns in test results to confirm the theories of language ability that inform test design. In this case, Gu grapples with the role of contexts in shaping language use. The material on the TOEFL iBT® can be divided into instructional tasks (tasks based on academic study settings such as lectures and essays) and non-instructional tasks (concerned with university life around campus). For example, in the listening section of the test form studied by Gu, one of the six tasks involved listening to a lecture on art history (instructional), while another involved a conversation between a student and an employee of the university housing office (non-instructional). Gu found that the test results were better explained by a statistical model that labelled test tasks both by skill (listening, reading, writing and speaking) and by context (instructional and non-instructional) than by models that labelled tasks only by skill or only by context.
This finding supports the test developers’ assumption that both skill and context contribute to language use and should be reflected in test performance. Both Elvis Wagner (Chapter 5) and Carsten Roever (Chapter 9) approach issues similar to those addressed by Gu, but from a different perspective, judging the selection of test material in relation to theories of language use. Wagner observes that most of the listening sections in major tests of English for Academic Purposes used in admissions to North American universities (including TOEFL iBT®) involve the use of scripted texts. These lack many of the features of the spontaneous speech that students will encounter in university life (such as filled pauses, hesitation phenomena and back-channelling): features that are known to impact on comprehension. In this sense, the test material does not fully reflect the language awareness that students will need in the university context. Roever raises further questions about the coverage of tests such as IELTS™, TOEFL iBT® and PTE Academic™. He notes that the models of communicative language ability which inform the design of these and many other language tests include a component termed ‘pragmatic competence’, but that the tests do not include items specifically targeting pragmatics. Results from his experimental test suggest that it is possible to measure learners’ pragmatic abilities, but he acknowledges that much work remains to be done to establish what this might contribute to the overall value of test results. Language-testing organizations need to consider these kinds of challenges when justifying the use of their tests and working to improve their products. They need to tackle such questions as why the test is suitable for its purpose and how scores should be interpreted by those who use them to make decisions. A number of the chapters provide insider perspectives on how this work is done.
Elaine Boyd and Cathie Taylor (Chapter 2) describe how Trinity College, London has used Weir’s (2005) sociocognitive test validation framework to collect evidence for the validity of their Graded Exams in Spoken English, drawing on insights from a panel of experts in pedagogy, second-language acquisition and language testing. Chapter 4, by John de Jong and Ying Zheng, explains the part that the Common European Framework of Reference (CEFR) played in the design and refinement of the PTE Academic™, and shows how scores on the test were subsequently related to CEFR levels, drawing on evidence from a variety of sources. In Chapter 13, Jennifer Norton and Carsten Wilmes describe how the qualities of a test used with English-language learners in US schools (ACCESS for ELLs) were investigated as it was adapted for online delivery. In this case, the test developers used cognitive labs (interviewing students as they engaged with test tasks) and other forms of evidence to explore how the learners coped with the material. They explain how the test developers responded to the issues that emerged to improve the quality of the test. On a smaller scale, Valerie Meier, Jonathan Trace and Gerriet Janssen in Chapter 8 exemplify how a rating scale used for scoring a test of extensive writing can be successfully improved for use in a particular local context (a university in Colombia). In this case, revisions were based on insights from the examiners who had been using the scale. The comparison of the results achieved before and after the revision of the scale offers an impressive demonstration of what can be accomplished within an institutional language programme. In Chapter 6, Ari Huhta, Charles Alderson, Lea Nieminen and Riikka Ullakonoja investigate foreign language reading abilities and the role played in these by such factors as parental education, home environment, use of foreign languages and attitude to reading in a foreign language.
The study involves both Finnish children learning to read in English and the children of Russian immigrants to Finland learning to read in Finnish. The picture that emerges is complex and reveals some fascinating differences between the two groups. For example, the younger the Finnish children were when they first learned to read in Finnish, the better their English reading scores were likely to be. However, perhaps reflecting something about their families’ use of languages at home, children who learned to read in Russian at an early age were less likely to perform well on tests of Finnish reading than those who learned when they were older. Norman Verhelst, Jayanti Banerjee and Patrick McLain (Chapter 12) discuss the challenges involved in trying to establish whether some test material may favour particular groups in society. They present a new statistical approach to detecting such issues. It emerged from their study that younger test takers (under 17) did not perform as well as their older counterparts on MET® test tasks that involved language associated with the workplace. The authors conclude that a new version of the MET® may be required: one that is specifically designed for teenage learners. Another tricky question, that of how score users can decide ‘how much is good enough’ when it comes to test scores, is the topic of Chapter 11 by Vivien Berry and Barry O’Sullivan. They report on the procedures they followed to determine the IELTS™ scores that should be required of international medical graduates in order for them to be allowed to practise in the UK. The chapters by Sarah Cushing Weigle and Sarah Goodwin (Chapter 10) and by Haiying Li, Keith Shubeck and Arthur Graesser (Chapter 14) touch on applications of new technology in language assessment.
Weigle and Goodwin illustrate the new insights that large-scale language corpora, made possible through information technology, can provide to help test developers to relate the language used in tests to the language actually used in real-world contexts. Li, Shubeck and Graesser describe some of the benefits of speech recognition and automated text analysis tools and the new opportunities these open up for automated scoring systems. They also describe innovative automated tutoring systems that use assessment to support individuated learning (although it is a pity that the system described is not a language learning tool). Two chapters explore how teachers and learners respond to tests. Doris Froetscher (Chapter 3) concentrates on how an important national examination (the Austrian Matura administered at the end of secondary schooling) affects the tests that teachers make for use in the classroom. While changes to the national exams do appear to have brought positive improvements in the tests used by teachers in class, they may also be restricting the variety of methods used and encouraging teachers to become overreliant on the published materials. Contrasting with the large-scale studies presented in the other chapters, Zhengdong Gan (Chapter 7) explores how two language learners, training as teachers of English, responded to their failure on an oral assessment (the Language Proficiency Assessment for Teachers) in Hong Kong. The study traces how the two used their experience to diagnose their own strengths and weaknesses in carrying out test tasks and to adopt strategies that they believed would bring them success at their next attempt. The two trainees learned to ‘play the game’ by coming to better understand the demands of the test, but also displayed a sophisticated understanding of what the test could reveal about their spoken language skills. 
Taken as a whole, this book may not provide an all-encompassing picture of the field, and more might have been done to bring together the common threads in the various chapters. Nonetheless, it certainly gives the reader a taste of a wide variety of language testing research projects. It includes large-scale projects carried out by recognized figures and well-resourced organizations alongside relatively small-scale doctoral studies from emerging researchers. Each of the individual chapters would offer useful insights to researchers engaged in similar projects and would offer interesting material for discussion on graduate courses in language assessment.

Anthony Green is Professor in Language Assessment and Director of the Centre for Research in English Language Learning and Assessment at the University of Bedfordshire. He has written and published widely on language assessment: recent books include Exploring Language Assessment and Testing (Routledge 2014) and Language Functions Revisited (Cambridge University Press 2012). He is Immediate Past President of the International Language Testing Association (ILTA) and an Expert Member of the European Association for Language Testing and Assessment (EALTA).

References

Coombe, C., P. Davidson, B. O’Sullivan, and S. Stoynoff (eds.). 2012. The Cambridge Guide to Second Language Assessment. Cambridge: Cambridge University Press.
Fulcher, G. and F. Davidson (eds.). 2013. The Routledge Handbook of Language Testing. Abingdon: Routledge.
Kunnan, A. (ed.). 2013. The Companion to Language Assessment. Hoboken, NJ: Wiley-Blackwell.
Tsagari, D. and J. Banerjee (eds.). 2016. Handbook of Second Language Assessment. Boston, MA: De Gruyter Mouton.

© The Author(s) 2018. Published by Oxford University Press; all rights reserved.

Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press; all rights reserved.
ISSN
0951-0893
eISSN
1477-4526
D.O.I.
10.1093/elt/ccx057


Journal

ELT Journal, Oxford University Press

Published: Jan 1, 2018
