There is good reason why a recent article by Thomas Corbin and others has become the most-read article of the past year in its journal, with more than 25000 reads in the nine weeks since its online publication on 15 May of this year. The article is about assessment – one of the hot topics in higher education – and it is indeed a game-changing assessment of current assessment practices and a call towards a new future.

The journal is Assessment & Evaluation in Higher Education. Under the journal’s heading Trending, this article has held the number-one spot for several weeks now. In addition, its high Altmetric score of 60 indicates the online attention and engagement this scholarly article has received.

The title of the article is ‘Talk is cheap: why structural assessment changes are needed for a time of GenAI’. It addresses the topic of assessment at a very fundamental level, and in time it may well prove to be a pivotal point in the scholarly engagement with assessment in higher education.

Assessment has been centre stage in discussions among academics since the appearance of ChatGPT on 30 November 2022. Initially, to protect legacy forms of assessment, the reaction of many institutions worldwide was to ban, restrict or discourage the use of GenAI in academic work.

Over time, several two-lane approaches developed – either ignoring the use of GenAI in the belief that final examinations would filter out inappropriate use during the various assignments, or permitting its use alongside some training on appropriate use. These two-lane approaches were mostly shaped into what came to be called ‘traffic light’ approaches, in which student academic activities were classified according to the three colours of traffic lights: red indicates that student use of GenAI for an activity is inappropriate; green indicates that all such use is acceptable; and amber typically marks a category for which permission for GenAI use must be obtained. In addition, students were often required to declare their use of GenAI in assignments, sometimes in relation to these categories.

In essence, the two-lane approaches also strive to protect the integrity of the learning process and the assessments to be undertaken, albeit with some minor changes to assessment practices.

Gradually, in some learning contexts, the three colours became five, with additional categories for borderline uses of GenAI – for example in the AI Assessment Scale (AIAS), revised by its authors in 2024 to include five finer categories relating to current student use of GenAI (and with colours no longer reminiscent of traffic lights).

In their article, Corbin and others take the position that GenAI “challenges assessment validity by enabling students to complete tasks without demonstrating genuine capability”. Guidelines can hardly be watertight and are interpreted or rationalised by students in various ways, often differently from the intentions of the lecturers. This leads to what the authors designate an ‘enforcement illusion’.

The authors distinguish between ‘discursive’ methods and the ‘structural changes’ for which they plead. Discursive methods rely on categories of use or non-use of GenAI; they create ambiguity and misunderstanding, are very difficult to enforce consistently, can endanger assessment validity and might harm institutional reputation.

What is required instead, they maintain, is “a shift towards structural assessment redesign that builds validity into assessment architecture rather than attempting to impose it through unenforceable rules”. The authors then elaborate on what such redesign entails, namely: “Modifications that directly alter the nature, format, or mechanics of how a task must be completed, such that the success of these changes is not reliant on the student’s understanding, interpretation, or compliance with instructions. Instead, these changes reshape the underlying framework of the task, constraining or opening the student’s approach in ways that are built into the assessment itself”.

In their groundbreaking publication, Corbin and colleagues clearly set the objectives for the work to be done on assessment and spell out the implications for educators. From their closing remarks:

“The path forward through this increasingly challenging terrain lies not in more sophisticated rules about AI use, but in fundamentally redesigning how we structure assessments to demonstrate student capability. This will require significant effort and creativity from educators but has the advantage of allowing for genuine solutions to maintaining assessment validity in an AI-enabled world….” (my italics)

The article by Corbin and others is indeed a must-read for university managers and lecturers alike.

 

Walter Claassen (SARUA Associate)