Exams are a well-established method for assessing academic achievement, often trusted by both academics and students to be a valid and reliable measure of achievement. Academic institutions like exams for their efficiency of delivery and grading, and for facilitating objective, easily documented and standardized measurement. Students feel safe with the format of an exam. In the UK and in many other contexts, students will have been taking exams throughout their school years and will be well versed, and sometimes very skilled, in the art of understanding what is required from an exam and demonstrating precisely that, and nothing superfluous to it.

However, at the same time and increasingly so in recent years, exams have also been scrutinized and criticized for reinforcing inequalities, lacking authenticity in terms of what students will be expected to do in their future academic and/or professional careers, and for not providing opportunities to demonstrate depth or application of learning.

As academic institutions continually work to enhance teaching, learning and feedback, the question of whether exams are the best and most reliable tool for assessing outcomes is part of a long and ongoing conversation about broader assessment and feedback practices.
This article provides some points for reflection when deciding whether an exam is an appropriate mode of assessment for your context, drawing heavily on a recent conversation between Dr Mary Richardson, Professor of Educational Assessment at UCL’s Faculty of Education and Society, and Ishan Kolhatkar, Global Client Evangelist at Inspera.

Why Are Exams Such a Core Part of Assessment?

Exams are often criticized for being neither authentic nor relevant, and for only effectively assessing shallower aspects of learning such as basic knowledge recall. Time-constrained delivery without access to other resources is also criticized for causing stress to students and therefore preventing them from demonstrating their true achievement and potential.

On this final point, there can be particular concerns regarding inequitable access for different genders and ethnicities. If an assessment needs to be an authentic representation of what the student will be expected to do in their future academic or professional careers, then an exam probably will not be the best format.

So why, then, are exams so widely used?

Professor Richardson comments that there are probably three key reasons why an exam might be chosen:

  1. Efficiency for student application of effort. An exam is a one-time event, and it is possible to learn to be very good at the skill of taking an exam.
  2. Cost to the institution. Administering an exam is a relatively cheap way to manage assessment. A cohort of students sits the same paper at the same time, and then it is made available for marking to all of the markers.
  3. Subject/field-specific requirements. Some subjects will probably always require an exam. For example, surgeons, lawyers and pilots need to take high-stakes, high-pressure tests because performing under pressure, with ready access to knowledge, is what they will need to be able to do in their careers. These tests have to be high stakes and highly standardized.

Closed-book exams in secure conditions can also provide a robust response to concerns about the unauthorized use of AI. Added to this, some students will be resistant to more creative assessments as they trust the format of an exam. They have been familiar with exams as a measurement tool from a very young age, and they may perceive them as being more ‘fair’ as everyone has the same requirements made of them in the same conditions.

When Are Exams Not Appropriate as an Assessment Tool?

As Professor Richardson explains, ‘… assessment shouldn’t be used inappropriately in a way that could harm the test taker’. A standard exam that hasn’t changed much over the years could be doing exactly this by denying the student the opportunity to provide evidence of their abilities.

Some people are better able to learn the skills for performing well in an exam, and will therefore be at an advantage in an exam situation over their peers. This means that the underlying construct (theoretical knowledge or skills being assessed) of an exam paper could include the ability to take an exam. 

Where an exam is well written, this might be appropriate (e.g. for surgeons or pilots). Otherwise, it is inappropriate and harmful for a person to need good exam skills in order to achieve the stated learning outcomes of their course of study, which are unlikely to include ‘exam skills’. The student is most often left to their own devices here, relying on skills that have been honed over years at school or specifically learned outside of their field of study, and the risk is that students without exam-taking skills are not being provided with an equitable opportunity to show what they can do.

A test taker is further harmed by exams as they are unlikely to provide the opportunity to create something personal. Being able to create a more personal artifact through coursework has the potential to be highly motivational, while also enabling the student to demonstrate their individual key learnings and how they are able to apply them in a real-world context.

How to Decide If Exams Are the Right Choice

While recognizing the need for more authentic, inclusive and meaningful assessment, how can institutions make the decision over when to appropriately use an exam to measure learning outcomes?

Professor Richardson advises asking yourself one core question: ‘What do I want to know about what my students know when they have finished my course of study?’. For example, do you need to know that they ‘know’ as in have ready access to certain facts? And/or that they have this knowledge and can critically evaluate it? And/or that they can explain it to someone else, or apply it in a range of contexts, or one specific context? 

This core question is apparently simple and yet very complex. It forces the educator to consider explicitly what the outcomes are in terms of what the student actually knows and can do with that knowledge, which then forces the creation of an assessment that will enable the student to demonstrate this.

Assessments need to be evaluated to ensure that they don’t assess something that is not a required outcome, and that they actually give the students the opportunity to demonstrate what they have achieved without artificial and/or unnecessary restrictions. Flint and Johnson (2011) investigated fairness of assessment in an Australian university and found that a key factor affecting students’ perception of fair assessment was the [lack of] opportunity to demonstrate their abilities.

This all reinforces the point that the educator needs to understand what they want to know about what their students know. Understanding this is fundamental to the design of assessment, and best practice would include helping students to understand this too.

When evaluating whether the best form of assessment for their context is an exam or not, educators should remember the power that assessment has in terms of its impact on the behavior of both students and academics. What is assessed is implicitly what is valued, and how to meet those requirements will be the focus of a great deal of energy and can be a significant motivator.

How to Improve Assessment Practices

There is broad recognition that assessment practices are not always appropriate and are indeed sometimes even harmful, and it seems likely that exams are not always the best form of assessment despite having been used for many years. However, there is a resistance to changing well established practice, which is perhaps not surprising given the expectations that there are for assessments to perform multiple functions. 

‘Good assessment needs to be rigorous but not exclusive, to be authentic yet reliable, to be exacting while also being fair and equitable, to adhere to long-established standards but to reflect and adapt to contemporary needs’ (Hounsell, Xu and Tai, 2007, p.1). The realization of these multiple layers of double duty, and the seemingly insurmountable contradictions between them, often hampers progressive practice.

Professor Richardson mentions a need for bravery within the mainly risk averse context of higher education. She explains that while a lot of work has been done on assessment for learning and more effective use of assessments, policies can sometimes block student progress from being made. 

Professor Richardson talks about the moral panic in universities caused by ChatGPT, which forced many institutions to examine assessment practices across the entire institution to consider the likely problems and risks of ChatGPT and how they could best be mitigated. However, she argues that although some people will cheat, most don’t want to: they want to learn and to demonstrate their learning. We know that students, like almost everyone in their professional lives, will use large language models, which have many positive affordances that can be built into assessments.

For example, students can be asked to use large language models to create an artifact, and then to evaluate that artifact and incorporate it into their own work. Enabling this demonstrates good practice and pedagogy to students, as AI is a fact of academic and professional life, and it fosters trust and accountability.

Professor Richardson talks about nudge theory, and how small activities can lead to wider changes, such as conversations with people about the changes they would like to see, what their dream is and what can be done now. She goes on to explain that the greatest power comes from involving students in this work and asking them what will be most useful to them when they become a professional in their chosen field. 

Ishan Kolhatkar describes his previous practice as an academic where he co-designed a module and assessment with learners, and where he provided a cohort with learning outcomes and asked them to provide evidence that they had met them. For change to be effective, there needs to be commitment from senior management and champions of the initiative (Price, 2013). Therefore, understanding who to have these conversations with and the type of conversation that is needed will be important.

Ishan raises the question during the conversation about how we satisfy regulators that assessments are of equal integrity and difficulty where we have provided this level of flexibility. Professor Richardson explains that the work still has to be done to evidence alignment, but if the course has really clear learning outcomes then this alignment should be possible. 

Do Exams Have a Place in a More Holistic Assessment Strategy?

If an exam is an appropriate assessment, then it does have a place. Coursework or continuous assessment may provide many opportunities for assessment for, as and of learning, and an exam may be the most appropriate format for assessing student learning at a specific point in a program of learning.

An exam, therefore, could be reserved for specific points within a more creative palette of assessment. For example, a historian will have access to a plethora of resources when curating a museum exhibition and considering how best to engage members of the public and foster collaboration with other institutions. However, when they are in a meeting with another institution to discuss, say, sharing of resources or how exhibitions might inter-relate, they need ready and in-depth knowledge of the subject matter, i.e. the artifacts and how they sit within a socio-cultural context.

Therefore, a diverse menu of assessment could include authentic assessment such as development of an exhibition and associated activities, as well as an exam that requires access to knowledge about history and artifacts within a specified time. While the exam format may be considered to be inauthentic, it would be valid and authentic in this context in that it demands ready access to specific knowledge.

A more holistic approach to assessment is likely to involve more choice for students over how they are assessed, and it should be remembered that an exam might also be the assessment format of choice for students who are good at taking exams and feel comfortable with them.

It is also the case that certain regulatory bodies will require evidence that an exam has been securely delivered, and that there is absolute confidence that the performance is the student’s own with no external support other than authorized resources where applicable. More holistic or authentic assessment will provide scope for support whether that is from AI or another person. Arguably, an exam that is administered under secure conditions is the only way to be absolutely sure that the performance is the student’s own unsupported efforts. 

Professor Richardson agrees to some extent with the assertion that more holistic assessments will mean that students may use AI, but reminds us that historically people have always been afraid of change and in times of great change tend to focus on negative impacts. When we are caught up in this cycle it is a good time to go back to her core question: ‘What do I want to know about what my students know when they have finished my course of study?’ and think about whether this can include their judicious use of AI where that is appropriate.

Should Exams Be Designed Out of Assessment Practice?

Quality assurance in institutions will undoubtedly continue to focus on the verifiable outcomes of assessment, and require assessments used for summative purposes to be robust and consistent with practice within that institution and across other institutions in higher education. This in turn leads to a focus on readily quantifiable measures of student outcomes and accountability, which can inhibit changes to more traditional exams. But this does not mean that assessments cannot be more holistic in their design, only that, as Professor Richardson states, learning outcomes must be clearly identifiable and demonstrated. 

Within a mixed and varied diet of assessments which provide ample opportunities for students to demonstrate their mastery of their subject, exams still have their place where it is appropriate for students to demonstrate knowledge in a constrained environment. However, where exams are not appropriate, Professor Richardson warns us that they are known to impact on the achievement of students from certain backgrounds, and to favor the people who are simply better at exams and who have benefitted from this since they began school education.

There is a place for exams within a framework of holistic assessment, but care should be taken to ensure that they are used appropriately and without harm to students.

Fiona Orel, Senior Account Manager UK&I, Senior Fellowship Advance HE and Former Educator

Resources

  • Instructure webinar ‘The Exam is Dead, but is it Really?’ (12 November, 2024) Hosted by Sidharth Oberoi, Vice President International Product Strategy at Instructure, with Ishan Kolhatkar, Global Client Evangelist at Inspera in conversation with Dr Mary Richardson, Professor of Educational Assessment at UCL’s Faculty of Education and Society. 
  • Carless, D. (2015). Excellence in University Assessment. Oxon: Routledge.
  • Flint, N. and Johnson, B. (2011). Towards Fairer University Assessment: Recognizing the Concerns of Students. London: Routledge.
  • Price, M. (2013). Fostering institutional change: Overview. In S. Merry, M. Price, D. Carless and M. Taras (eds), Reconceptualising Feedback in Higher Education: Developing Dialogue with Students (pp. 145-146). London: Routledge.