Accountability: just what do we want to measure?

Blog Editor, IOE Digital, 27 December 2012

Chris Husbands
Over the last 20 years, secondary school performance measures have had an enormous impact on schools’ behaviour, parental preference and, indeed, local house prices. Published in local and national league tables, they have been based on the proportion of a school’s students who secure 5 A*-C GCSEs in English, mathematics and three other subjects.
Just before Christmas, in what appeared to be a leak of the Government’s review of secondary school accountability, the Daily Telegraph reported that this measure will go, to be replaced with an average points score. For example, 8 points would be given for an A*, 7 for an A and so on. Unfortunately, and hopefully not on the basis of leaks from the DfE, the Telegraph spoilt its story with two schoolboy howlers: first by suggesting that this provided a “more precise calculation of achievement”, rather than a measure of examination attainment, and second by arguing that this would prevent “over-generous marking” by teachers  – clearly, the way scores are reported for accountability purposes says nothing about the assessment methods which make them up.
In principle, it makes good sense to hold schools to account for the progress made by all their pupils rather than the sub-set who achieve the highest grades. On this principle alone, the Telegraph-reported proposal is welcome and a big advance on current arrangements. At present, the grade C threshold encourages schools to focus attention on that threshold to the detriment of other thresholds, and certainly to the detriment of reporting the progress made by all children. Effectively, we have been using a secondary school performance measure which relates to the performance of only three fifths of young people, which may well be related to our ingrained problem of the long tail of low performance. The idea of calculating a points score is an advance.
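To make the contrast concrete, here is a minimal sketch of the two measures side by side. The points scale follows the reported "8 for an A*, 7 for an A and so on"; extending it down to G = 1 and U = 0 is my assumption, and the three pupils and their grades are invented purely for illustration.

```python
# Points scale as reported: 8 for an A*, 7 for an A "and so on".
# Extending it down to G = 1 and U = 0 is an assumption.
POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1, "U": 0}


def meets_threshold(grades):
    """True if a pupil has 5+ grades at C or above, including English and maths."""
    good = [subject for subject, grade in grades.items()
            if POINTS[grade] >= POINTS["C"]]
    return len(good) >= 5 and "English" in good and "Maths" in good


def average_points(grades):
    """Mean points per entry for one pupil."""
    return sum(POINTS[grade] for grade in grades.values()) / len(grades)


def school_measures(cohort):
    """Return (share of pupils meeting the threshold, mean of pupils' average points)."""
    share = sum(meets_threshold(pupil) for pupil in cohort) / len(cohort)
    avg = sum(average_points(pupil) for pupil in cohort) / len(cohort)
    return share, avg


# Hypothetical cohort of three pupils (invented data).
cohort = [
    {"English": "C", "Maths": "C", "Science": "C", "History": "C", "Art": "C"},
    {"English": "D", "Maths": "C", "Science": "B", "History": "B", "Art": "A"},
    {"English": "G", "Maths": "F", "Science": "F", "History": "G", "Art": "E"},
]

share, avg = school_measures(cohort)
print(f"Share meeting 5 A*-C incl. English and maths: {share:.2f}")
print(f"Average points per entry: {avg:.2f}")
```

In this invented cohort only the first pupil counts towards the threshold measure, while all three contribute to the average. Any improvement by the third pupil would move the average without touching the threshold figure.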
As ever, though, in matters of assessment policy, things are not that simple. In the first place, the measure as reported treats grade boundaries as single steps up an equally calibrated staircase – the step from a G to an F would be treated as the same step up (one step) as from a D to a C or from an A to an A*. But this is misleading on several grounds. As any grade boundary archive makes plain, not all grade boundaries are the same size. Normally, the critical C boundary is set, and then other boundaries are derived statistically based on deviations from the C boundary. Some boundaries are then set as equal steps between marks, but not all are. The conversion of grades to numbers for the purpose of deriving a total average grade assumes a statistical pattern which is not there in pupils’ performance: not all grades are simple steps up in marks.
It gets more complex: although it is important to hold schools to account for the progress made by all pupils, in practice the C/D borderline is important for post-16 progression. Getting a C in maths allows a pupil to progress in ways that getting a D doesn’t, but a B opens very few additional progression possibilities which a C does not. Whether the C boundary should be as important as it has become can be debated, but it does matter – and internationally, the idea of thresholds for functional numeracy and facility in the national language is gaining currency. The C/D borderline measures this – crudely and ineffectively in all sorts of ways – in the way that a measure across the attainment range does not. American schools report their graduation rates – and some students take longer than others to graduate; those graduation rates report a threshold, and American schools are often fairly incurious about performance above it. High school graduation, as too many teen movies bear witness, is graduation.
There is a further difficulty. Most observers argue that schools should be held to account for the progress made by the pupils they teach, and it has long been pointed out that the performance of some schools is flattered by the focus on the proportion securing 5 A*-C GCSEs: there are schools which should be doing much better than they are given their intake. The proposed calculation, although it is a step forward, is still not a progress measure. For schools, the key indicator is not the measure of the overall attainment of a cohort, but the measure of levels of progress from entry to exit. That is a much more genuinely inclusive measure. But even that is complicated as the education system is gently tilted back towards norm- rather than criterion-referenced assessment methods, so that not all pupils may be able to make three levels of progress.
And there is one more complexity, which matters if you accept that accountability measures can drive perverse behaviours. The focus on the C threshold may encourage schools to invest considerable resources at the C/D borderline, but the concern with the number of students reaching a threshold does force schools to be concerned with the performance of individuals. Basing accountability on average scores shifts the focus from individuals to grades. Most of us want schools to be concerned with outcomes for individuals.
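A small worked example shows how an average can hide what happens to individuals. The two schools below are hypothetical, each with four invented grades in a single subject, and the scale again assumes the reported "8 for an A*, 7 for an A" pattern continues down to G = 1 and U = 0.

```python
# Assumed points scale: the reported A* = 8, A = 7 pattern extended down to U = 0.
POINTS = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1, "U": 0}


def mean_points(grades):
    """Average points across a list of grades."""
    return sum(POINTS[g] for g in grades) / len(grades)


def share_at_c_or_above(grades):
    """Proportion of grades at C or better."""
    return sum(POINTS[g] >= POINTS["C"] for g in grades) / len(grades)


# Two hypothetical schools, four pupils each, one subject (invented data).
school_a = ["C", "C", "C", "C"]
school_b = ["A*", "A*", "F", "F"]

for name, grades in (("School A", school_a), ("School B", school_b)):
    print(name, mean_points(grades), share_at_c_or_above(grades))
```

Both invented schools average 5 points, yet in School B half the pupils fall well short of a C. An accountability measure built only on the average would not distinguish them.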
The Telegraph report reminds us that “accountability” for performance in education is complex. Developing measures which genuinely allow schools to demonstrate what they have achieved with young people is complex. Translating them into a readily understood format which can be communicated clearly is perhaps even more complex. At root, society needs clarity about what it wants to hold schools to account for: the progress made by individual pupils, in which case we should worry less about thresholds, or their ability to move all pupils to an agreed threshold, in which case we should worry less about above-threshold performance, or their ability to push the most able to elite levels of performance, in which case we need to reflect on how to map the performance of all. Until we clarify that, we will struggle with inadequate measures in which we vest too much confidence.