Experts in Language Assessment

Cambridge ESOL and Fairness

 

A test can only be considered good if it is also fair: this may seem self-evident, but defining what fairness is and how it should operate is not simple. For example:

  • A test that discriminates against sub-groups of people, because of their gender, cultural background or some other reason, is not fair to those sub-groups
  • A test that gives an advantage to a particular sub-group of people, for whatever reason, is not fair to other candidates
  • A test that fails to distinguish accurately between people who meet the standard of ability being tested and those who do not is not fair to those who rely on the test as an indicator of that ability (such as employers or universities)
  • A test that does not perform consistently is not fair to anyone who takes it or relies on its results

So, how test fairness is perceived will largely depend on who you are and how you intend to use the test.

Although test fairness has become an important issue in language assessment in recent years, it has been an integral part of Cambridge ESOL's approach since we first offered the Certificate of Proficiency in English in 1913. Through the years we have used the latest research and technology to develop our approach to fairness. We outline this approach below.

What is the Cambridge ESOL approach to test fairness?

We are a not-for-profit organisation: our commitment to fairness in educational testing and assessment is part of our larger concern for education and ethical behaviour within society. Our approach to test fairness, therefore, looks at the entire experience and consequences of testing for individuals, groups, and society as a whole.

We believe a fair test is one in which the ability being tested (in the assessment field this is called the 'test construct') is the primary focus and where all irrelevant barriers to candidate performance have been removed.

Why is test fairness important?

Tests are often used to make important decisions and can have serious consequences for an individual's career or life chances. They can affect what happens in teaching and learning at classroom or school level and can also influence regional or national educational systems. They can affect employers’ selection and recruitment of staff and impact on civic life in areas such as immigration or access to university education. In light of this, test producers must be sensitive to the consequential issues surrounding their tests, monitoring them against accepted professional standards and ethical codes.

Which aspects of tests need to be considered in relation to fairness?

Fairness touches upon all aspects of a test:

  • construction – how it is conceived and designed
  • administration – how the test is delivered and conducted
  • evaluation – how the candidate’s performance is marked

In relation to test construction, for example, the fairness of the test can be affected by who designs the test specifications and what level of expertise they bring to that task; or through the choice of the language domain (e.g. general or business English, low or high level proficiency) and the tasks or items that are chosen to represent it. The use of technology is also becoming an important fairness issue in test construction: great care has to be taken to ensure that the technology fits the test and the candidates, and that the test and the candidates are not being made to fit the available technology.

One example of this is the use of computer-based Speaking tests, where the candidate talks to a machine and the performance is rated by software. Cambridge ESOL's view is that, while this technology is suitable for giving a quick and affordable 'snapshot' of language ability, it does not provide the depth and richness of information that a face-to-face Speaking test can give. Because of this, we believe that fairness to both candidates and the organisations that rely on these exams is largely about matching the right test to the purpose. Using computer-based Speaking tests inappropriately, for example to assess the ability to use English in demanding environments such as university courses, can lead to difficulties. Cambridge ESOL's high-stakes exams (tests used for access to professional groups, immigration or university acceptance) all use face-to-face Speaking tests and human raters.

Impact (the effect a test has on teaching and the wider community) is another important aspect of test fairness that has implications for test construction. It is inevitable that teachers will adapt their course content to prepare students to pass the test they are studying towards. If the test construct is too narrow, students can pass their test without having sufficient knowledge and ability for real communication, which is unfair both to the students and to the organisations that might rely on their exam certificates. The test construct, therefore, must be sufficiently broad and carefully specified against known data on how language is used to communicate in the real world (such as written and spoken corpora, which are data banks of real language usage). Rigorous procedures must then be used to ensure that the developed test behaves in the way it has been specified to: this is called test validation.

The fairness of the test will also be shaped by aspects of test administration, such as ensuring standardised conditions for taking the test and a consistent approach to marking. This is especially important when human raters are involved in assessing test performance.
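Consistency between human raters can be checked numerically. As a minimal sketch only (it is not a description of Cambridge ESOL's own procedures), the Python function below computes Cohen's kappa, a common measure of how far two raters agree beyond what chance alone would produce. The function name and the sample marks are assumptions made for this illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who marked the same performances.

    `rater_a` and `rater_b` are equal-length sequences of marks (for example,
    band scores). Hypothetical helper written for this illustration.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must mark the same, non-empty set")
    n = len(rater_a)

    # Proportion of performances on which the two raters gave the same mark.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's own mark distribution.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[mark] * freq_b[mark] for mark in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical mark throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Invented marks for six Speaking performances.
    first_rater = [3, 4, 4, 5, 3, 2]
    second_rater = [3, 4, 5, 5, 3, 3]
    print(round(cohens_kappa(first_rater, second_rater), 2))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. Simple kappa treats every disagreement as equally serious, so for ordered band scores a weighted variant may be more appropriate.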

Cambridge ESOL's contribution towards fairness in language assessment

As well as constantly pursuing fairness in its own examinations, Cambridge ESOL has also contributed to the development of testing standards and professional practice throughout the world as a founder member of The Association of Language Testers in Europe (ALTE). The ALTE Code of Practice (1994) provides a set of general principles that offer an explicit framework for reviewing the fairness of language tests. The Code generates a set of ALTE Minimum Standards (2001), or guidelines for good practice, relating to the specific areas of:

  • test construction
  • administration and logistics
  • marking and grading
  • test analysis
  • communication with stakeholders

The ALTE Code has gone on to inform other professional language testing standards such as the ILTA Code of Ethics (2001) and the EALTA Guidelines for Good Practice (2006).

How does Cambridge ESOL implement the ALTE standards?

Cambridge ESOL's systems contribute to fairness in each of these key areas of exam development and delivery:

Test construction
  • candidate information analyses
  • detailed test specification
  • in-depth item writer training
  • extensive pre-testing and item calibration
  • trialling of speaking and writing tasks
  • modified tests for test takers with special requirements

Marking and grading
  • rigorous examiner training
  • marking and grading procedures, inc. checking
  • detailed appeals procedure

Test analysis
  • comprehensive routine post-test analyses, e.g. Differential Item Functioning analyses
  • ongoing validation and evaluation studies, e.g. impact investigation
  • regular revision projects

Administration and logistics
  • comprehensive test centre regulations
  • test centre staff training, management and monitoring
  • secure test despatch
  • secure and confidential test results
  • extensive support systems – web, hotline, etc.

Communication with stakeholders
  • ESOL website
  • sample/past test materials
  • regular stakeholder consultation
  • teacher handbooks
  • research publications
  • teacher seminars
  • conference presentations

This process is not a one-off event, however: after a test has been developed, approved and released, it needs to be evaluated and revised over time to ensure that it continues to be fair. In addition, by using the latest research and advanced statistical software, we can carry out analyses and give assurances about quality that would have been impractical in the past. Cambridge ESOL's programme of continual evaluation and revision ensures that all of our examinations meet the high standards of fairness, accuracy and reliability that we demand and upon which the three million people who take our tests each year rely.
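To illustrate the kind of post-test analysis mentioned above, the following sketch shows one widely used way of checking a single item for Differential Item Functioning: the Mantel-Haenszel procedure. Candidates are matched on overall ability (here, a total-score band), and the odds of answering the item correctly are then compared between a reference group and a focal group. This is an illustration under stated assumptions, not a description of Cambridge ESOL's own software or method; the function names, group labels and data layout are invented for the example.

```python
from collections import defaultdict
from math import log


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item, pooled across ability bands.

    `records` is an iterable of (group, band, correct) tuples, where group is
    "reference" or "focal", band is the candidate's total-score band, and
    correct is True or False for the item under study.
    (Hypothetical data layout, chosen for this example.)
    """
    # One 2x2 table per band:
    # [reference right, reference wrong, focal right, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, band, correct in records:
        if group not in ("reference", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        column = (0 if group == "reference" else 2) + (0 if correct else 1)
        tables[band][column] += 1

    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in tables.values():
        total = ref_right + ref_wrong + focal_right + focal_wrong
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these data")
    return numerator / denominator


def mh_delta(odds_ratio):
    """Express the odds ratio on the delta scale: -2.35 * ln(odds ratio)."""
    return -2.35 * log(odds_ratio)


if __name__ == "__main__":
    # Invented responses for one item within a single score band.
    sample = (
        [("reference", 1, True)] * 10 + [("reference", 1, False)] * 10
        + [("focal", 1, True)] * 2 + [("focal", 1, False)] * 8
    )
    ratio = mantel_haenszel_odds_ratio(sample)
    print(ratio, mh_delta(ratio))
```

An odds ratio close to 1 (a delta close to 0) suggests that candidates of similar ability in the two groups perform similarly on the item. A value far from 1 flags the item for expert review; it does not in itself prove that the item is unfair.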