I’m still reading through the Ofqual report into this summer’s GCSE English examinations, trying to find the holes (of which there are many) and work out which ones are the most important.
I think one of the biggest holes in their analysis of the problem is right in our faces on the summary page. They claim:
The report acknowledges that, knowing what we know now, regulators and the exam boards could have done more. Some controls, especially moderation tolerances for controlled assessment, could have been run more tightly and communication could have been better in various ways. However, once this GCSE design had been implemented, Ofqual does not believe that the problems seen this summer could have been eliminated, no matter how much more tightly these qualifications had been managed.
Before we go any further, let me say this: the biggest problem this summer (with AQA) was not with the controlled assessment. The number of extra marks required for a C on the controlled assessment component (worth 60%) was 5. On the Unit 1 exam (worth 40%) the number of extra marks required for a C was 9. This massively penalised those students who were sitting the exam for the first time, and those who had done badly in previous sittings and were resitting. Combined with the changes to the CA marks, this led to devastating results in some schools.
On Friday the media blamed teachers under pressure, but behind that attention-grabbing storyline the real situation is very murky. If teachers were marking too high and the tolerance was too wide, then it becomes very easy for Ofqual to shirk any responsibility. Their response to the apparent increase in grades was to ask the exam boards to move the grade boundaries up. As I’ve already discussed on this blog, they couldn’t change grade boundaries for previous exams, so they had to adjust this summer’s boundaries. The problem is that THIS affected all schools, regardless of whether they were perceived to have manipulated their marking or not.
If you don’t understand how the moderation process works, it’s basically like this:
Moderation involves schools sending a sample of students’ work to a moderator, who checks the marking. The moderator selects a sample of the folders received from each school and re-marks it. If they agree (within tolerance) with the centre’s marks, they record the candidates whose work they have moderated, the number of marks the centre awarded each candidate, and how many marks the moderator felt the folder was worth. If they don’t agree with the centre’s marks, they moderate an extended sample of students’ work and then record their marks and the centre marks on the MMF. Either way, the exam board gets what the moderator thinks the marks should have been, regardless of whether they were in tolerance or not.
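The flow described above can be sketched in a few lines of Python. This is purely illustrative: the function name, the data layout and the tolerance value are my own assumptions, not AQA’s actual systems.

```python
# Illustrative sketch of the moderation flow described above.
# Names, data shapes and the tolerance value are hypothetical, not AQA's.

def moderate_sample(sample, tolerance=5):
    """Compare centre marks against moderator marks for a sample of folders.

    `sample` is a list of (candidate_id, centre_mark, moderator_mark) tuples.
    Returns the records passed to the exam board (both marks for every
    moderated candidate) and a flag saying whether any folder fell outside
    tolerance, which would trigger moderation of an extended sample.
    """
    records = []
    out_of_tolerance = False
    for candidate_id, centre_mark, moderator_mark in sample:
        if abs(centre_mark - moderator_mark) > tolerance:
            out_of_tolerance = True
        # The board receives both marks either way.
        records.append((candidate_id, centre_mark, moderator_mark))
    return records, out_of_tolerance
```

The key point the sketch makes concrete: the board receives the moderator’s marks for every sampled candidate whether or not the centre was in tolerance.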
This is important because Ofqual are claiming that the tolerance allowed schools to manipulate the results and over-mark. If that was the case, then the exam boards could have reduced the tolerance and adjusted individual schools’ controlled assessment marks, rather than adjusting the grade boundaries, which affected every school in the country. This would have penalised the schools that were apparently manipulating the system, rather than EVERYBODY.
It’s not just me or other English teachers who are frustrated with the situation. I’ve had an email from an AQA moderator who moderated both GCSE English and English Language and Literature. Their response is published below in full and unedited:
So Glenys Stacey thinks that the blame for the furore over the 2012 exam results in English lies squarely at the feet of the teachers who, she says, overmarked their students’ controlled assessments. Moderation did not solve the problem because of the “tolerance” rule which means that moderator marks and centre marks can disagree – as long as they are within tolerance, the marks will not be adjusted.
As with everything that I do, I took this job very seriously. I was acutely aware that I was going to be passing judgement on the work of colleagues as well as on the work of students, and I made sure that I was thorough in my work as well as accurate. Frequently on the phone to my team leader, I sought advice when I felt that I needed to, and referred the most difficult cases up the line to her. My reports to centres were detailed – not one was under a side in length – and referred to strengths as well as areas for development. I wanted to ensure that the work of the centres was recognised in the report; as a teacher myself, I know just how hard teachers work to get their controlled assessments marked accurately and off to the moderator.
What now seems to be in question is what we mean by “accurate”. If we take English Language as an example, there are a total of 80 possible marks over 4 tasks. To add to the complication, one additional mark is awarded for a combination of two pieces. This gives the teacher five marks to award. With a tolerance of just 5 marks, the teacher only has to be out by one mark on each task for their total to be at the upper limit of tolerance. In English, tolerance is 6 marks across 5 tasks with a possible 90 marks to award. In Literature, there is only one task, worth 40 marks. Tolerance is just 3 marks for a task so complex that teachers up and down the land have struggled to make sense of it. With mark schemes as subjective as ours, one person’s “clear and consistent” is another’s “confident and assured”.
What Ms Stacey is suggesting is that the “over marking” was deliberate. I disagree. When I think back to May and June, what I most clearly remember is the sheer amount of work that was evident in the folders. Almost without exception, they were annotated in detail to show how marks were awarded. Almost without exception, they were accurately marked. Almost without exception, the task setting was excellent.
What I believe happened was this: teachers did the best job they could with the limited information and complex mark schemes that they were given. Tolerance rules mean that, although some students got more raw marks from the centre than the moderator might have given them, they got to keep those marks. Of course, there were also candidates who got fewer marks for the same reason. Where centres were out of tolerance, that information was passed up the line and those centres were adjusted accordingly.
Moderators send a comprehensive report of their marking not only to the centre, but also to the awarding body. They tell them the centre mark and the moderator mark. If tolerance was such an issue, why was this not picked up and altered sooner? It would have been straightforward enough to move the tolerance down, say to 4 marks instead of 5. Those centres that were then out of tolerance could have been adjusted, whilst those still within would have retained their marks. This would have been a fairer way of managing the situation and a straightforward task, given the quantity of information held by the awarding body.
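The adjustment the moderator proposes can be sketched as follows. This is a hypothetical illustration of the idea, not the awarding body’s procedure: it re-applies a reduced tolerance to the centre/moderator mark pairs the board already holds, and adjusts only the centres that now fall outside it.

```python
# Hypothetical sketch of the fairer adjustment described above: re-apply a
# reduced tolerance to the marks the awarding body already holds, and
# adjust only centres that fall outside it.

def adjust_centres(centres, new_tolerance=4):
    """`centres` maps centre name -> (centre_mark, moderator_mark).

    Returns the mark each centre keeps: the moderator's mark where the
    difference exceeds the reduced tolerance, the centre's own mark otherwise.
    """
    final = {}
    for name, (centre_mark, moderator_mark) in centres.items():
        if abs(centre_mark - moderator_mark) > new_tolerance:
            final[name] = moderator_mark  # out of tolerance: adjusted
        else:
            final[name] = centre_mark     # within tolerance: marks retained
    return final
```

So a centre 5 marks above the moderator (inside the old tolerance of 5, outside the new one of 4) would be adjusted, while a centre 2 marks above keeps its marks. That is the contrast with moving grade boundaries, which hits every centre equally.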
What my centres got from me was detailed feedback with comments on their admin and annotation, task setting and planning sheets, and on individual pieces of work. I asked myself “If I were them, what would I want to know?” and wrote my reports accordingly. Where they were only just within tolerance, I told them that. Now that tolerance is likely to be reduced, I hope they’ll be glad of that information. Our moderator was less thorough: the report refers to being within tolerance but doesn’t give a clue as to what degree. This could be a problem for us.
I think that says it all.