Joyful assessment: beyond high-stakes testing

Here are my slides from my presentation at the Innovate Learning Summit yesterday. It’s not world-shattering stuff – just a brutal attack on proctored, unseen written exams (PUWEs, pronounced ‘pooies’), followed by a description of the rationale, process, benefits, and unwanted consequences behind the particular portfolio-based approach to assessment employed in most of my teaching. It includes a set of constraints that I think are important to consider in any assessment process, grouped into pedagogical, motivational, and housekeeping (mainly relating to credentials) clusters. I list 13 benefits of my approach relating to each of those clusters, which I think make a pretty resounding case for using it instead of traditional assignments and tests. However, I also discuss outstanding issues, most of which relate to the external context and expectations of students or the institution, but a couple of which are fairly fundamental flaws (notably the extreme importance of prompt, caring, helpful instructor/tutor engagement in making it all work, which can be highly problematic when it doesn’t happen) that I am still struggling with.

Evaluating assessment

Exam A group of us at AU have begun discussions about how we might transform our assessment practices, in the light of the far-reaching AU Imagine plan and principles. This is a rare and exciting opportunity to bring about radical and positive change in how learning happens at the institution. Hard technologies influence soft more than vice versa, and assessments (particularly when tied to credentials) tend to be among the hardest of all technologies in any pedagogical intervention. They are therefore a powerful lever for change. Equally, and for the same reasons, they are too often the large, slow, structural elements that infest systems to stunt progress and innovation.

Almost all learning must involve assessment, whether it be of one’s own learning, or provided by other people or machines. Even babies constantly assess their own learning. Reflection is assessment. It is completely natural and it only gets weird when we treat it as a summative judgment, especially when we add grades or credentials to the process, thus normally changing the purpose of learning from achieving competence to achieving a reward. At best it distorts learning, making it seem like a chore rather than a delight, at worst it destroys it, even (and perhaps especially) when learners successfully comply with the demands of assessors and get a good grade. Unfortunately, that’s how most educational systems are structured, so the big challenge to all teachers must be to eliminate or at least to massively reduce this deeply pernicious effect. A large number of the pedagogies that we most value are designed to solve problems that are directly caused by credentials. These pedagogies include assessment practices themselves.

With that in mind, before the group’s first meeting I compiled a list of some of the main principles that I adhere to when designing assessments, most of which are designed to reduce or eliminate the structural failings of educational systems. The meeting caused me to reflect a bit more. This is the result:

Principles applying to all assessments

  • The primary purpose of assessment is to help the learner to improve their learning. All assessment should be formative.
  • Assessment without feedback (teacher, peer, machine, self) is judgement, not assessment, pointless.
  • Ideally, feedback should be direct and immediate or, at least, as prompt as possible.
  • Feedback should only ever relate to what has been done, never the doer.
  • No criticism should ever be made without also at least outlining steps that might be taken to improve on it.
  • Grades (with some very rare minor exceptions where the grade is intrinsic to the activity, such as some gaming scenarios or, arguably, objective single-answer quizzes with T/F answers) are not feedback.
  • Assessment should never ever be used to reward or punish particular prior learning behaviours (e.g. use of exams to encourage revision, grades as goals, marks for participation, etc) .
  • Students should be able to choose how, when and on what they are assessed.
  • Where possible, students should participate in the assessment of themselves and others.
  • Assessment should help the teacher to understand the needs, interests, skills, and gaps in knowledge of their students, and should be used to help to improve teaching.
  • Assessment is a way to show learners that we care about their learning.

Specific principles for summative assessments

A secondary (and always secondary) purpose of assessment is to provide evidence for credentials. This is normally described as summative assessment, implying that it assesses a state of accomplishment when learning has ended. That is a completely ridiculous idea. Learning doesn’t end. Human learning is not in any meaningful way like programming a computer or storing stuff in a database. Knowledge and skills are active, ever-transforming, forever actively renewed, reframed, modified, and extended. They are things we do, not things we have.

With that in mind, here are my principles for assessment for credentials (none of which supersede or override any of the above core principles for assessment, which always apply):

  • There should be no assessment task that is not in itself a positive learning activity. Anything else is at best inefficient, at worst punitive/extrinsically rewarding.
  • Assessment for credentials must be fairly applied to all students.
  • Credentials should never be based on comparisons between students (norm-referenced assessment is always, unequivocally, and unredeemably wrong).
  • The criteria for achieving a credential should be clear to the learner and other interested parties (such as employers or other institutions), ideally before it happens, though this should not forestall the achievement and consideration of other valuable outcomes.
  • There is no such thing as failure, only unfinished learning. Credentials should only celebrate success, not punish current inability to succeed.
  • Students should be able to choose when they are ready to be assessed, and should be able to keep trying until they succeed.
  • Credentials should be based on evidence of competence and nothing else.
  • It should be impossible to compromise an assessment by revealing either the assessment or solutions to it.
  • There should be at least two ways to demonstrate competence, ideally more. Students should only have to prove it once (though may do so in many ways and many times, if they wish).
  • More than one person should be involved in judging competence (at least as an option, and/or on a regularly taken sample).
  • Students should have at least some say in how, when, and where they are assessed.
  • Where possible (accepting potential issues with professional accreditation, credit transfer, etc) they should have some say over the competencies that are assessed, in weighting and/or outcome.
  • Grades and marks should be avoided except where mandated elsewhere. Even then, all passes should be treated as an ‘A’ because students should be able to keep trying until they excel.
  • Great success may sometimes be worthy of an award – e.g. a distinction – but such an award should never be treated as a reward.
  • Assessment for credentials should demonstrate the ability to apply learning in an authentic context. There may be many such contexts.
  • Ideally, assessment for credentials should be decoupled from the main teaching process, because of risks of bias, the potential issues of teaching to the test (regardless of individual needs, interests and capabilities) and the dangers to motivation of the assessment crowding out the learning. However, these risks are much lower if all the above principles are taken on board.

I have most likely missed a few important issues, and there is a bit of redundancy in all this, but this is a work in progress. I think it covers the main points.

Further random reflections

There are some overriding principles and implied specifics in all of this. For instance, respect for diversity, accessibility, respect for individuals, and recognition of student control all fall out of or underpin these principles. It implies that we should recognize success, even when it is not the success we expected, so outcome harvesting makes far more sense than measurement of planned outcomes. It implies that failure should only ever be seen as unfinished learning, not as a summative judgment of terminal competence, so appreciative inquiry is far better than negative critique. It implies flexibility in all aspects of the activity. It implies, above and beyond any other purpose, that the focus should always be on learning. If assessment for credentials adversely affects learning then it should be changed at once.

In terms of implementation, while objective quizzes and their cousins can play a useful formative role in helping students to self-assess and to build confidence, machines (whether implemented by computers or rule-following humans) should normally be kept out of credentialling. There’s a place for AI but only when it augments and informs human intelligence, never when it behaves autonomously. Written exams and their ilk should be avoided, unless they conform to or do not conflict with all the above principles: I have found very few examples like this in the real world, though some practical demonstrations of competence in an authentic setting (e.g. lab work and reporting) and some reflective exercises on prior work can be effective.

A portfolio of evidence, including a reflective commentary, is usually going to be the backbone of any fair, humane, effective assessment: something that lets students highlight successes (whether planned or not), that helps them to consolidate what they have learned, and that is flexible enough to demonstrate competence shown in any number of ways. Outputs or observations of authentic activities are going to be important contributors to that. My personal preference in summative assessments is to only use the intended (including student-generated) and/or harvested outcomes for judging success, not for mandated assignments. This gives flexibility, it works for every subject, and it provides unquivocal and precise evidence of success. It’s also often good to talk with students, perhaps formally (e.g. a presentation or oral exam), in order to tease out what they really know and to give instant feedback. It is worth noting that, unlike written exams and their ilk, such methods are actually fun for all concerned, albeit that the pleasure comes from solving problems and overcoming challenges, so it is seldom easy.

Interestingly, there are occasions in traditional academia where these principles are, for the most part, already widely applied. A typical doctoral thesis/dissertation, for example, is often quite close to it (especially in more modern professional forms that put more emphasis on recording the process), as are some student projects. We know that such things are a really good idea, and lead to far richer, more persistent, more fulfilling learning for everyone. We do not do them ubiquitously for reasons of cost and time. It does take a long time to assess something like this well, and it can take more time during the rest of the teaching process thanks to the personalization (real personalization, not the teacher-imposed form popularized by learning analytics aficionados) and extra care that it implies. It is an efficient use of our time, though, because of its active contribution to learning, unlike a great many traditional assessment methods like teacher-set assignments (minimal contribution) and exams (negative contribution).  A lot of the reason for our reticence, though, is the typical university’s schedule and class timetabling, which makes everything pile on at once in an intolerable avalanche of submissions. If we really take autonomy and flexibility on board, it doesn’t have to be that way. If students submit work when it is ready to be submitted, if they are not all working in lock-step, and if it is a work of love rather than compliance, then assessment is often a positively pleasurable task and is naturally staggered. Yes, it probably costs a bit more time in the end (though there are plenty of ways to mitigate that, from peer groups to pedagogical design) but every part of it is dedicated to learning, and the results are much better for everyone.

Some useful further reading

This is a fairly random selection of sources that relate to the principles above in one way or another. I have definitely missed a lot. Sorry for any missing URLs or paywalled articles: you may be able to find downloadable online versions somewhere.

Boud, D., & Falchikov, N. (2006). Aligning assessment with long-term learning. Assessment & Evaluation in Higher Education, 31(4), 399-413. Retrieved from

Boud, D. (2007). Reframing assessment as if learning were important. Retrieved from

Cooperrider, D. L., & Srivastva, S. (1987). Appreciative inquiry in organizational life. Research in organizational change and development, 1, 129-169.

Deci, E. L., Vallerand, R. J., Pelletier, L. G., & Ryan, R. M. (1991). Motivation and education: The self-determination perspective. Educational Psychologist, 26(3/4), 325-346.

Hussey, T., & Smith, P. (2002). The trouble with learning outcomes. Active Learning in Higher Education, 3(3), 220-233.

Kohn, A. (1999). Punished by rewards: The trouble with gold stars, incentive plans, A’s, praise, and other bribes (Kindle ed.). Mariner Books. (this one is worth forking out money for).

Kohn, A. (2011). The case against grades. Educational Leadership, 69(3), 28-33.

Kohn, A. (2015). Four Reasons to Worry About “Personalized Learning”. Retrieved from (check out Alfie Kohn’s whole site for plentiful other papers and articles – consistently excellent).

Reeve, J. (2002). Self-determination theory applied to educational settings. In E. L. Deci & R. M. Ryan (Eds.), Handbook of Self-Determination research (pp. 183-203). Rochester, NY: The University of Rochester Press.

Ryan, R. M., & Deci, E. L. (2017). Self-determination theory: Basic psychological needs in motivation, development, and wellness. Guilford Publications. (may be worth paying for if such things interest you).

Wilson-Grau, R., & Britt, H. (2012). Outcome harvesting. Cairo: Ford Foundation.

Smart learning – a new approach or simply a new name? | Smart Learning

Kinshuk has begun a blog on smart learning and, in this post, defines what that means. I particularly like:

 I have come to realize that while technology can help us in improving learning, a fundamental change is needed in the overall perception of educators and learners to see any real effect. Simply trying to create adaptive systems, intelligent systems, or any sort of mobile/ubiquitous environments is going to have only superficial impact, if we do not change the way we teach, and more importantly, the way we think of learning process (and assessment process). 

This very much echoes my own view. At least that fundamental change is needed in the context of formal education. Outside our ivory towers that fundamental change has already happened and continues to accelerate. Google Search, Wikipedia, Twitter, Reddit, StackExchange, Facebook and countless others of their net-enabled ilk are amongst the most successful learning technologies (more accurately, components of learning technologies) ever created, arguably up there with language and writing, ultimately way beyond printing or schools. 

Kinshuk goes on to talk of an ecosystem of technology and pedagogy, which I think is a useful way of looking at it. Terry Anderson, too, talks of the dance between technology and pedagogy with much the same intent. I agree that we have to take a total systems view of this. My own take on it is that pedagogies are technologies – learning technologies are simply those with pedagogies in the assembly, whether human-instantiated or embedded in tools. Technologies and pedagogies are not separate categories. Within the ecosystem there are many other technologies involved in the assembly apart from those we traditionally label as ‘learning technologies’ such as timetables, organizational structures, regulations, departmental roles, accreditation frameworks, curricula, organizational methods, processes and rituals, not to mention pieces like routers, protocols, software programs and whiteboards. But, though important, technologies are not the only objects in this ecology. We need to think of the entire ecosystem and consider things that are not technologies at all like friendship, caring, learning, creativity, belief, environment, ethics, and, of course, people. As soon as you get past the ‘if intervention x, then result y’ mindset that plagues much learning technology (and education) research, and start to see it as a complex adaptive system that is ultimately about what it means to be human, you enter a world of rich complexity that I think is far more productive territory. Its an ecosystem that is filled not just with process but with meaning and value. 

On a more mundane and pragmatic note, I think it is worth observing that learning and accreditation of competence must be entirely separated – accreditation is an invasive parasite in this ecosystem that feeds on and consumes learning. Or maybe it is more like the effluent that poisons it. Either way, I’d prefer that accreditation should not be lumped under the ‘smart learning’ banner at all. ‘Smart accreditation’ is fine – I have no particular concerns about that, as a separate field of study. In some ways it is worthy of study in smart learning because of its effects. That is somewhat along the lines of studying oil spills when considering natural ecosystems.  Assessment (feedback, critical reflection, judgement, etc), on the other hand, is a totally different matter. Assessment is a critical part of almost any pedagogy worthy of the name and so of course must be part of a smart learning ecology. I’m not sure that it warrants a separate category of its own but it is certainly important. It is, however, highly dangerous to take the ‘easy’ next step of using it to assert competence, especially when that assertion becomes the reason for learning in the first place, or is used as a tool to manipulate learners. That is what predominantly drives education now, to the point that it threatens the entire ecosystem. 

That said, I’d like to think that it is possible that the paths of accreditation and assessment might one day rejoin because they do share copious commonalities. It would be great to find ways that the smart stuff we are doing to support learning might, as a byproduct, also be useful evidence in accreditation, without clogging up the whole ecosystem. Technologies like Caliper, TinCan, and portfolios offer much promise for that. 

Address of the bookmark: