
IOE Blog


Expert opinion from IOE, UCL's Faculty of Education and Society


Testing times: how can we build a system that will assess what we value?

By Blog Editor, IOE Digital, on 13 November 2018

IOE Events.
In response to the many criticisms levelled at England’s testing and assessment system, from its effects on children’s mental health to its impact on their learning, for our latest IOE debate we posed the question: What if… we re-designed our school testing and assessment system from scratch? To help us reflect on this provocation we were delighted to welcome: Ruth Dann, Associate Professor of assessment at the IOE; Tim Oates of Cambridge Assessment; Dave Mellor of AQA; and Ken Jones of the National Education Union and Goldsmiths. Their inputs sparked some lively Tweeting at #IOEDebates, and some great comments and questions from our audience.

Ken Jones took up the spirit of the ‘What if…’ debates as an invitation to some utopian thinking. He transported us to 2022 and a world in which statutory testing before the age of 16 had been suspended, with assessment’s role in school accountability replaced by sample testing to monitor national trends and a greater emphasis on school self-evaluation. This was a world in which assessment was focused on supporting pupils’ learning, in a context in which education was a collective endeavour, not a fight for pole position. Ruth Dann’s call for more criterion- rather than norm-referenced assessment chimed with these sentiments.
Back to 2018, and we learnt from Dave Mellor just what a mammoth task reforming an assessment system is – a task that requires a lead-in time of at least 10 years (and, accordingly, a judgement of what we’ll need a decade hence). We also heard from Tim Oates that reform of one part of the assessment system has implications for another (for instance, it is the GCSE and A-level system that allows for England’s high-quality three- as opposed to four-year bachelor’s degree).
Bearing those caveats in mind, how might we move towards a testing and assessment system that is less easy to caricature?  If England doesn’t have that much more high-stakes assessment than many other countries, why does it hang so heavily over our system?
Perhaps the key message was the need to put the curriculum first and build assessment from that, rather than building the curriculum from qualifications, as is currently the case. We need to decide what we value in pupils’ learning and to assess what matters in that regard.
From there, the elephant in the room is the relationship between testing and school accountability.  We are yet to see how Ofsted’s recently announced intention to place more emphasis on curriculum breadth in school inspections will play out in practice.  But we can hope that this move will help the system pay greater attention to a wider range of skills, especially those that are less easy to assess – not least the ‘soft skills’ that we’re told are becoming all the more important in the age of AI.  These are skills that are very difficult to assess, especially on a summative/high-stakes basis, and so are at greatest risk of being squeezed out if we’re too busy ‘treasuring what we’re measuring’.
Advances in the very technology that seems to be making human workers obsolete may offer a means of better assessing those skills in future. In the meantime, the pervasive impact of linking testing and assessment to school accountability remains evident in the outcome of the removal of the ‘levels’ that had been introduced alongside the National Curriculum. The policy was meant as a means of freeing the system from constraints.  In practice, levels have simply been replaced by age-related expectations, which has resulted in more intensive tracking, exacerbating dysfunctional labelling and short-term or shallow learning.
In one of the more unexpected interjections for a debate on assessment, we had reference to the film classic Casablanca – specifically, the character Captain Renault and policy makers’ shared practice of expressing shock and disdain at something they are complicit in (to save you looking this one up too, here’s the clip). The need for school accountability is not in question. But it’s not enough to offer well-meaning steering from on high about not ‘teaching to the test’ in a system in which all the drivers and pressures are in the opposite direction.
So, the outcomes of assessment need to be used sensibly. The other point that united our panellists was the need to improve everyone’s assessment literacy – teachers’, pupils’, parents’, employers’, and the general public’s. Testing and assessment are complex matters, made more so by the different purposes that they are asked to serve. The more we all understand about the technical and political aspects of our assessment system the better.
Turn over your test paper, you may begin.
Watch or listen back to the debate in full here. Our next debate looks at the curriculum – find out more and book your free place here.


One Response to “Testing times: how can we build a system that will assess what we value?”

  • 1
    John Mountford wrote on 13 November 2018:

    Clearly, successive governments have held SATs as something of a gold standard when comparing schools’ performance nationally. In turn, Ofsted has used the data published in league tables to hold schools to account and rank them accordingly. All this rests on whether these tests produce accurate and reliable outcomes. Whether they add to the cognitive development of pupils is quite another debate for another time.
    Aware that increasing numbers of secondary schools are using cognitive ability tests at the beginning of Yr7, my colleague, Roger Titcombe, and I decided to compare these with the SATs scores obtained earlier. We wanted to know whether there was any conflict in the results thrown up. Further, we believed any mismatch might affect certain pupils more than others. For instance, we know from data published by GL Assessment, who provide the Cognitive Ability Tests, that FSM children, on average, have lower cognitive abilities. Our intention was to investigate whether this would be borne out in our analysis.
    What we found has clear implications for setting Progress 8 and Attainment 8 targets for all secondary schools, but especially for those with intakes skewed towards the lower end of the ability range with higher numbers of FSM pupils.
    As we postulated, our completed analysis of this small-scale research shows SATs scores are generally inflated compared to cognitive ability scores. However, the evidence obtained clearly indicates that the inflation effect is greatest for those pupils of lower cognitive ability. Furthermore, the lower it is, the greater the effect.
    We further established that for FSM pupils, SATs are slightly depressed in relation to the cohort mean but when their cognitive ability scores are taken into account, these same pupils score significantly lower. This can be seen in the attached Table 1.
    In 2016 the reporting of SATs was changed. The concept of National Curriculum levels was abandoned. The DfE now reports SATs results on what it refers to as a standard scale with a mean of 100, a minimum of 80 and a maximum of 120. The explanation of how these are reached remains questionable. A set of conversion charts is published annually for use by testers. These charts are subject to change each year under direction from the Secretary of State for Education and convert the raw test marks to a score on the 80 – 120 scale. There is no statistical validity for these data. They are recorded thus, without reference to internationally accepted processes for age-related standardisation. The fact is, as norm-referenced tests, SATs results are highly influenced by cramming and coaching, which has been recently confirmed by Amanda Spielman, Chief HMI. https://www.bbc.co.uk/news/education-45560165
    That this effect is acknowledged by the Chief HMI adds significantly to the validity of the findings from our research. Having now taken up the matter with The Rt Honourable Nick Gibb, we are awaiting his response. We believe the impact of this work justifies a broader analysis of the phenomenon. We will be asking the DfE to commission such from an independent researcher.
    Very recently a set of three articles on the pupil premium was published on her website by Professor Rebecca Allen, Professor of Education at UCL Institute of Education. It is difficult to overstate the potential of this work to have a significant impact on where we should go next in making sure the English education system is truly fit for purpose in the 21st century. With this in mind I would like to draw attention to Roger’s summary as published in his blog, Learning Matters.