Cambridge ESOL exams and the CEFR
The empirical perspective
Shared understanding enabled the framework concept to function quite well without extensive underpinning from measurement theory and statistics. However, measurement theory became increasingly important as attempts were made to validate aspects of the CEFR empirically (North and Schneider 1998, North 2000a) and to link assessments to it (North 2000b). Claims of linkage or alignment need to be examined carefully; simply asserting that a test is aligned with a particular CEFR level does not make it so, even if the assertion rests on an intuitive or reasonable subjective judgement. To some extent, alignment can be achieved historically and conceptually, but empirical alignment requires more rigorous analytical approaches. Appropriate evidence needs to be accumulated and scrutinised.
The ALTE Can Do Project in 1998–2000 (Jones 2000, 2001, 2002) was one empirical approach used by Cambridge ESOL for aligning its original five levels with the six-level CEFR. Other empirical support for alignment comes from the item-banking methodology underpinning Cambridge’s approach to all test development and validation (Weir and Milanovic 2003). Latent trait methods have been used since the early 1990s to link the various Cambridge levels onto a common measurement scale using a range of quantitative approaches, e.g. IRT Rasch-based methodology.
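As a rough illustration of what placing different tests on a common measurement scale involves, the sketch below uses the standard Rasch model together with a simple mean-mean shift computed from anchor items that appear on two test forms. It is a minimal example under stated assumptions, not a description of Cambridge ESOL's actual procedures: the function names and all difficulty values are hypothetical, and operational linking involves estimation and quality checks that are omitted here.

```python
import math


def rasch_probability(ability, difficulty):
    """Rasch model: chance of a correct response, with both values in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def linking_shift(anchor_on_base, anchor_on_new):
    """Mean-mean linking constant from items calibrated on both scales.

    Returns the amount to add to new-form difficulties so that the anchor
    items have the same average difficulty as on the base scale.
    """
    if not anchor_on_base or len(anchor_on_base) != len(anchor_on_new):
        raise ValueError("anchor lists must be non-empty and of equal length")
    diffs = [b - n for b, n in zip(anchor_on_base, anchor_on_new)]
    return sum(diffs) / len(diffs)


def to_base_scale(difficulties, shift):
    """Express new-form item difficulties on the base scale."""
    return [d + shift for d in difficulties]


# Hypothetical difficulties (logits) for three anchor items, as calibrated
# separately on the base scale and on a new test form.
anchor_on_base = [-0.5, 0.25, 1.0]
anchor_on_new = [-1.5, -0.75, 0.0]

shift = linking_shift(anchor_on_base, anchor_on_new)

# Hypothetical non-anchor items from the new form, moved onto the base scale.
linked = to_base_scale([-1.0, 0.5], shift)
```

Once the items of both forms are expressed on one scale, `rasch_probability` can be used to compare how a candidate of a given ability would be expected to perform on items drawn from either form.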
More recently, Cambridge ESOL supported the authoring and piloting of the Council of Europe’s Manual Relating Language Examinations to the CEFR with its linking process based on three sets of procedures: specification; standardisation; empirical validation.
Specification procedures were used when the PET and KET tests were originally based upon Threshold and Waystage levels, and when the ALTE partners’ exams were aligned within the ALTE Framework. Extensive documentation for all the Cambridge ESOL exams (test specifications, item writer guidelines, examiner training materials, test handbooks and examination reports) specifies the content and purpose of existing and new exams with direct reference to the CEFR. In fact, the Manual’s alignment procedures are embedded within the test development and validation cycle of Cambridge ESOL.
Cambridge helped develop the standardised materials needed to benchmark tests against CEFR levels; these include calibrated test items and tasks from General English Reading and Listening test item banks together with exemplar Writing test performances from writing examiner coordination packs and Speaking test performances from Oral Examiner standardisation materials at each CEFR level. The benchmarking materials, incorporating both classroom-based and test-based materials, are available from the Council of Europe on CD or DVD.
Empirical validation studies are a greater challenge, sometimes requiring specialist expertise and resources. Cambridge ESOL is among a relatively small number of examination providers undertaking this sort of research, partly through our routine item-banking and test calibration methodology and partly through instrumental research and case studies such as the Common Scale for Writing Project (Hawkey and Barker 2004).
Further information
References:
Hawkey, R and Barker, F (2004) Developing a Common Scale for the Assessment of Writing, Assessing Writing 9 (2).
Jones, N (2000) Background to the validation of the ALTE ‘Can-do’ project and the revised Common European Framework, Research Notes 2, 11–13.
Jones, N (2001) The ALTE Can Do Project and the role of measurement in constructing a proficiency framework, Research Notes 5, 5–8.
Jones, N (2002) Relating the ALTE Framework to the Common European Framework of Reference, in Council of Europe, Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Case Studies, Strasbourg: Council of Europe Publishing, 167–183.
North, B (2000a) The development of a common framework scale of language proficiency, New York: Peter Lang.
North, B (2000b) Linking Language Assessments: an example in a low-stakes context, System 28, 555–577.
North, B and Schneider, G (1998) Scaling Descriptors for Language Proficiency Scales, Language Testing 15 (2), 217–262.
Weir, C J and Milanovic, M (Eds) (2003) Continuity and Innovation: The History of the CPE 1913–2002, Studies in Language Testing 15, Cambridge: Cambridge University Press/UCLES.

