In my contribution to last week’s IOE Debate asking ‘What if… we designed our school testing and assessment system from scratch?’ I distilled what I think are 7 key principles that might help us shape our examination and assessment system differently.
Principle 1. That all tests and examinations can only ever be a proxy measurement, sampling what someone knows. All exam results will have measurement error. Exam boards try to minimise such error, by giving careful attention to issues of validity and reliability. However, in England, for GCSE and A levels, we do not know how questions will affect different subgroups of the candidate cohort, as questions are not trialled in advance because they might be leaked.
Of further concern are issues of bias and fairness which extend beyond the test paper and should include consideration of the opportunities that pupils have had to learn, …
In response to the many criticisms levelled at England’s testing and assessment system, from its effects on children’s mental health to its impact on their learning, for our latest IOE debate we posed the question What if… we re-designed our school testing and assessment system from scratch? To help us reflect on this provocation we were delighted to welcome: Ruth Dann, Associate Professor of assessment at the IOE; Tim Oates of Cambridge Assessment; Dave Mellor of AQA; and Ken Jones of the National Education Union and Goldsmiths. Their inputs sparked some lively Tweeting at #IOEDebates, and some great comments and questions from our audience.
If nothing else, today’s Ofqual report into this year’s GCSE reminds us that few things in education are more technically complex than assessment. The controversial report itself is a difficult document to navigate. The differences between marks, grades and awards, between syllabus content and specification structure, between coursework, controlled assessment and terminal assessment and the different things they can tell us are all a reminder that to construct, develop and manage an assessment regime is an enormous challenge. Ofqual picks its way through this complexity and has come up with a clear view: GCSEs went wrong in 2012 because the highly regulated system is overburdened. We expect too much of our assessment system, and as a result, our system drives perverse behaviour.
As Ofqual remind their readers, the English GCSEs which year 11 students completed this year were new: the GCSEs they replaced in 2010 had been in place for eight years and teachers and schools had become used to them. The replacement of coursework by controlled assessment (assessments completed in schools under controlled conditions) had been designed to address perceived problems of external help and plagiarism (para 1.48), but it threw up new challenges about the management of controlled assessment in school. For Ofqual, the results this year were a crisis of regulation and of complexity. They point out that the reliance on controlled assessment – 60% of the marks in English GCSE – placed a big emphasis on the role of schools and “we do not regulate schools” (para 1.49). The report heaps some blame on the now (perhaps, in the circumstances, conveniently) abolished Qualifications and Curriculum Authority for failing to grasp the “difficulties of maintaining standards in a set of new qualifications of such complexity” (para 1.48 again).
These are devastating conclusions. Ofqual claim that regulation failed at the point of specification design, and introduced a major unregulated component into the assessment system. For Ofqual’s numerous critics, this is a whitewash, shifting the blame for the crisis onto teachers who over-marked controlled assessments, and diverting attention away from Ofqual’s own regulation of key aspects of the system – including the moderation of controlled assessment: essentially, examination board moderators did not cavil at schools’ marks. No-one reading the report from a dispassionate perspective can feel satisfied about the regulation and management of a complex examination system.
Tucked away in the report is perhaps the most important sentence: “We have found evidence that this [the use of examination thresholds at grade C] can lead to undue pressure on schools in the way they mark controlled assessments. A recurring theme in our interviews with schools was the pressure exerted by the accountability arrangements, and the extent to which it drives teachers to predict and manage grade outcomes” (para 6.3).
Over the last 30 years, we have placed greater and greater weight on grade boundaries: they determine not only children’s futures, but also the fate of schools and, increasingly, individual teachers’ career progression. Schools below threshold are subject to intervention strategies and may be taken over. For teachers, the mooted possibility of performance-related pay systems would simply lay greater emphasis on the importance of examination results.
I blogged earlier this year about the infamous Atlanta testing scandal in the United States, where cheating became endemic because of the rewards for “success”. We have, collectively, to reflect now on the school accountability system, and whether a crude examination-led accountability system is not always going to lead us into difficulty. Once again, Campbell’s law is vindicated: “The more any quantitative indicator is used for decision-making, the more subject it is to corruption pressures and the more apt it will be to distort the processes it monitors”.
If nothing else, the Ofqual report might put another nail in the coffin of the current school accountability system. Schools need to be held accountable, and the highest standards of attainment matter – but we appear to have created a system which drives the most perverse behaviour – “cheating” as one highly respected journalist puts it.
Teachers are angry about the Ofqual report. They believed that they were acting not only professionally and morally but also with great technical accuracy. No-one who has examined the extraordinary sophistication of schools’ data tracking systems can fail to be impressed. They believed that they were doing what they were expected to do: using all their data internally and externally to map progress, to monitor performance, to predict outcomes and to design interventions. I’m lucky: I get to talk to teachers, school leaders and policymakers from around the world. They are in awe of the technical abilities displayed in monitoring performance which are routine in English schools. They understand that our information and performance systems are exceptional and our schools highly skilled.
Informed commentators in England, such as John Dunford, have argued that the time has come to move away from a system of external assessment to one based on internal assessment led by chartered assessors. Implicitly, the Ofqual report appears to make this more difficult. Its strong undercurrent – and another reason for the widespread professional anger – is that regulated assessment cannot be left to schools. That feels a disappointment, because properly conducted internal assessment can be much richer and more productive than most external examinations.
The Ofqual report is technically complex, and fascinating reading for those absorbed in the complexities of assessment, but it fails to pose really tough questions about the long-term future of assessment in England. It sets out the challenges of running a modern assessment system without really making the point that complexity is inevitable; it accurately highlights the consequences for schools of the over-emphasis on single accountability measures, but it does not yet pursue the logic of this for the long-term development of assessment systems in England.
Perhaps this is because of a structural flaw in the makeup of Ofqual: it is, after all, a regulator. But there is enough in the report which documents the systemic failures of regulation and the perverse behaviour driven by the overlap between our assessment and accountability systems to be clear that something needs to be done. We need a full-scale, politically neutral review of our education accountability framework.