Jon Powell (Assistant Professor in Engineering)

Every autumn, around thirty foundation-year students in our department write five laboratory reports in sequence. For years the pattern was the same — a careful Report 1, careful written feedback, a Report 2 that looked almost identical, and a cohort average that had barely moved by Report 5. Students were doing what we asked, the feedback we returned was specific and thoughtful, and it still wasn’t working.
Royce Sadler explained why over thirty years ago. Feedback can only improve a student’s work if they understand the standard being aimed for, can compare their current work against that standard, and can take some action to close the gap. A first-year student writing their first lab report has none of those things. They are new to the genre, they don’t yet know what a good report looks like, and comments arrive in a vocabulary they haven’t yet learned. I had been teaching the module as if my students arrived with the standard already in their heads, and I was wrong.
The design change itself was small. In Report 1 I now provide the results, discussion and conclusions sections as worked examples, and students write only one section, the summary. They then evaluate both their own writing and the worked examples against the same rubric. In Report 2 they take on the conclusions, in Report 3 the discussion, and by Report 4 they write the whole report independently. The writing load rises as their fluency in the genre rises.
Alongside that, every report comes with a coversheet. Before I will accept a report for marking, students have to identify strengths and points for improvement in their own work by reference to specific rubric criteria, each cited by its code number. Vague self-praise ("this section was good") does not pass; "Criterion 3.1.4 is met because…" does. The point is to force genuine engagement with the rubric at the level of named quality indicators, rather than as a general impression. Joanna Tai and colleagues call this evaluative judgement: the capacity to make informed decisions about the quality of one's own work.
I have now compared three cohorts before this change with three cohorts after, around 130 students in total. Overall, marks on the final report are up by roughly eight percentage points. What I had not expected was how unevenly the gains were distributed. Marks on the discussion section rose by about fourteen percentage points and on the conclusions section by about thirteen. Marks on the results section did not shift at all.
On reflection, that is exactly what the theory predicts. Criterion-referenced self-assessment develops evaluative judgement, and evaluative judgement is what the discussion and conclusions sections demand. It does not develop the procedural skill the results section asks for — that is taught elsewhere, by a separate data-handling workshop common to all cohorts. The intervention is doing what it is designed to do, and only that.
I should be honest about another finding. The picture at the individual student level is more contingent than the cohort averages suggest. Some students deepen their engagement with the coversheet across the term, become noticeably better calibrated about the quality of their own work, and lift their marks accordingly. Others disengage; their coversheets shrink to a line of perfunctory text and their self-scores drift further from mine over the term. The design creates the conditions for evaluative judgement to develop, but it does not compel it.
Two design moves here are portable. The first, giving students worked examples to evaluate against the rubric before they have to produce their own, works in any subject with structured written assignments. A history module could provide a model paragraph of historiographical argument and ask students to apply the marking criteria to it before writing their own. A colleague convening a parallel laboratory module in our School has already adapted the approach and finds that first submissions are noticeably clearer as a result. The second move, requiring self-assessment to reference specific criteria by code, works anywhere a rubric exists.
The cost of both, taken together, is one A4 sheet per submission. What I would say to anyone considering this is that the work is not in designing the scaffolding. It is in being patient enough not to abandon it when the middle reports of the sequence are harder for students, not easier, than the first. That is a feature of the design, not a failure of it. The recovery, and the gains on the sections that matter, come at the end.
Further reading
Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: a model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199–218.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119–144.
Tai, J., Ajjawi, R., Boud, D., Dawson, P., & Panadero, E. (2018). Developing evaluative judgement: enabling students to make decisions about the quality of work. Higher Education, 76(3), 467–481.
About the author
Dr Jon Powell is an Assistant Professor in Engineering at the University of Sussex, where he is Course Convenor for the Engineering Foundation Year and Head of Educational Enhancement. He is also the department’s Admissions Lead and Institution of Mechanical Engineers (IMechE) Accreditation Lead, and externally chairs accreditation panels for the Institute of Materials, Minerals and Mining and sits on its Accreditation and Professional Formation Committee.
His scholarship of teaching and learning focuses on inclusive assessment, the academic outcomes of widening-participation students, and translating professional standards into everyday pedagogy. Before joining Sussex in 2022 he spent over a decade as a lecturer at Chulalongkorn University in Thailand. He is a Chartered Engineer and a Fellow of the Higher Education Academy.
