Our educational assessment systems are designed to create losers

The always wonderful Alfie Kohn describes an airline survey that sought to find out how it compared with others, which he chose not to answer because the airline was thus signalling no interest in providing the best quality experience possible, just aiming to do enough to beat the competition. The thrust of his article is that much the same is true of standardized tests in schools. As Kohn rightly observes, the central purpose of testing as it tends to be used in schools and beyond is not to evaluate successful learning but to compare students (or teachers, or institutions, or regions) with one another in order to identify winners and losers.

‘When you think about it, all standardized tests — not just those that are norm-referenced — are based on this compulsion to compare. If we were interested in educational excellence, we could use authentic forms of assessment that are based on students’ performance at a variety of classroom projects over time. The only reason to standardize the process, to give all kids the same questions under the same conditions on a contrived, one-shot, high-stakes test, is if what we wanted to know wasn’t “How well are they learning?” but “Who’s beating whom?”’

It’s a good point, but I think it is not just an issue with standardized tests. The problem occurs with all the summative assessments (the judgments) we use. Our educational assessment systems are designed to create losers as much as they are made to find winners. Whether they follow the heinous practice of norm-referencing or not, they are sorting machines, built to discover competent people, and to discard the incompetent. In fact, as Kohn notes, when there are too many winners we are accused of grade inflation or a dropping of standards.

This makes no sense if you believe, as I do, that the purpose of education is to educate. In a system that demands grading, unless 100% of students that want to succeed get the best possible grades, then we have failed to meet the grade ourselves. The problem, though, is not so much the judgments themselves as it is the intimate, inextricable binding of judgmental with learning processes. Given enough time, effort, and effective teaching, almost anyone can achieve pretty much any skill or competence, as long as they stick at it. We have very deliberately built a system that does not aim for that at all. Instead, it aims to sort wheat from chaff. That’s not why I do the job I do, and I hope it is not why you do it either, but that’s exactly what the system is made to do. And yet we (at least I) think of ourselves as educators, not judges. These two roles are utterly separate and irreconcilably inconsistent.

Who needs 100%?

It might be argued that some students don’t actually want to get the best possible grades. True. And sure, we don’t always want or need to learn everything we could learn. If I am learning how to use a new device or musical instrument I sometimes read/watch enough to get me started and do not go any further, or skim through to get the general gist. Going for a less-than-perfect understanding is absolutely fine if that’s all you need right now. But that’s not quite how it works in formal education, in part because we punish those that make such choices (by giving lower grades) and in part because we systematically force students to learn stuff they neither want nor need to learn, at a time that we choose, using the lure of the big prizes at the end to coax them. Even those that actually do want or need to learn a topic must stick with it to the bitter end regardless of whether it is useful to do the whole thing, regardless of whether they need more or less of it, regardless of whether it is the right time to learn it, regardless of whether it is the right way for them to learn it. They must do all that we say they must do, or we won’t give them the gold star. That’s not even a good way to train a dog.

It gets worse. At least dogs normally get a second chance. Having set the bar, we normally give just a single chance at winning or, at best, an option to be re-tested (often at a price and usually only once), rather than doing the human thing of allowing people to take the time they need and learn from their mistakes until they get as good as they want or need to get. We could learn a thing or two from computer games –  the ability to repeat over and over, achieving small wins all along the way without huge penalties for losing, is a powerful way to gain competence and sustain motivation. It is better if students have some control over the pacing but, even at Athabasca, an aggressively open university that does its best to give everyone all the opportunity they need to succeed, where self-paced learners can choose the point at which they are ready to take the assessments, we still have strict cut-offs for contract periods and, like all the rest, we still tend to allow just a single stab at each assessment. In most of my own self-paced courses (and in some others) we try to soften that by allowing students to iterate without penalty until the end but, when that end comes, that’s still it. This is not for the benefit of the students: this is for our convenience. Yes, there is a cost to giving greater freedom – it takes time, effort, and compassion – but that’s a business problem to solve, not an insuperable barrier. WGU’s subscription model, for instance, in which students pay for an all-you-can-eat smorgasbord, appears to work pretty well.

Meta lessons

It might be argued that there are other important lessons that we teach when we competitively grade. Some might suggest that competition is a good thing to learn in and of itself, because it is one of the things that drives society and everyone has to do it at least sometimes. Sure, but cooperation and mutual support are usually better, or at least an essential counterpart, so embedding competition as the one and only modality seems a bit limiting. And, if we are serious about teaching people about how to compete, then that is what we should do, and not actively put them in jeopardy to achieve that: as Jerome Bruner succinctly put it, ‘Learning something with the aid of an instructor should, if instruction is effective, be less dangerous or risky or painful than learning on one’s own’ (Bruner 1966, p.44).

Others might claim that sticking with something you don’t like doing is a necessary lesson if people are to play a suitably humble/productive role in society. Such lessons have a place, I kind-of agree. Just not a central place, just not a pervasive place that underpins or, worse, displaces everything else. Yes, grit can be really useful, if you are pursuing your goals or helping others to reach theirs. By all means, let’s teach that, let’s nurture that, and by all means let’s do what we can to help students see how learning something we are teaching can help them to reach their goals, even though it might be difficult or unpleasant right now. But there’s a big difference between doing something for self or others, and subservient compliance with someone else’s demands. ‘Grit’ does not have to be synonymous with ‘taking orders’. Doing something distasteful because we feel we must, because it aligns with our sense of self-worth, because it will help those we care about, because it will lead us where we want to be, is all good. Doing something because someone else is making us do it (with the threat/reward of grades) might turn us into good soldiers, might generate a subservient workforce in a factory or coal face, might keep an unruly subjugated populace in check, but it’s not the kind of attitude that is going to be helpful if we want to nurture creative, caring, useful members of 21st Century society.

Societal roles

It might be argued that accreditation serves a powerful societal function, ranking and categorizing people in ways that (at least for the winners and for consumers of graduates) have some value. It’s a broken and heartless system, but our societies do tend to be organized around it and it would be quite disruptive if we got rid of it without finding some replacement. Without it, employers might actually need to look at evidence of what people have done, for instance, rather than speedily weeding out those with insufficient grades. Moreover, circularly enough, most of our students currently want and expect it because it’s how things are done in our culture. Even I, a critic of the system, proudly wear the label ‘Doctor’, because it confers status and signals particular kinds of achievement, and there is no doubt that it and other qualifications have been really quite useful in my career. If that were all accreditation did then I could quite happily live with it, even though the fact that I spent a few years researching something interesting about 15 years ago probably has relatively little bearing on what I do or can do now.  The problem is not accreditation in itself, but that it is inextricably bound to the learning process. Under such conditions, educational assessment systems are positively harmful to learning. They are anti-educative. Of necessity, due to the fact that they tend to determine precisely what students should do and how they should do it, they sap intrinsic motivation and undermine love of learning. Even the staunchest of defenders of tightly integrated learning and judgment would presumably accept that learning is at least as important as grading so, if grading undermines learning (and it quite unequivocally does), something is badly broken.

A simple solution?

It does not have to be this way. I’ve said it before but it bears repeating: at least a large part of the solution is to decouple learning and accreditation altogether. There is a need for some means to indicate prowess, sure. But the crude certificates we currently use may not be the best way to do that in all cases, and it doesn’t have to dominate the learning process to the point of killing love of learning. If we could drop the accreditation role during the teaching process we could focus much more on providing useful feedback, on valorizing failures as useful steps towards success, on making interesting diversions, on tailoring the learning experience to the learner’s interests and capabilities rather than to credential requirements, on providing learning experiences that are long enough and detailed enough for the students’ needs, rather than a uniform set of fixed lengths to suit our bureaucracies.

Equally, we could improve our ability to provide credentials. For those that need it, we could still offer plenty of accreditation opportunities, for example through a portfolio-based approach and/or collecting records of learning or badges along the way. We could even allow for some kind of testing like oral, written, or practical exams for those that must, where it is appropriate to the competence (not, as now, as a matter of course) and we could actually do it right, rather than in ways that positively enable and reward cheating. None of this has to be bound to specific courses. This decoupling would also give students the freedom to choose other ways of learning apart from our own courses, which would be quite a strong incentive for us to concentrate on teaching well. It might challenge us to come up with authentic forms of assessment that allow students to demonstrate competence through practice, or to use evidence from multiple sources, or to show their particular and unique skillset. It would almost certainly let us do both accreditation and teaching better. And it’s not as though we have no models to work from: from driving tests to diving tests to uses of portfolios in job interviews, there are plenty of examples of ways this can work already.

Apart from some increased complexities of managing such a system (which is where online tools can come in handy and where opportunities exist for online institutions that conventional face-to-face institutions cannot compete with) this is not a million miles removed from what we do now: it doesn’t require a revolution, just a simple shift in emphasis, and a separation of two unnecessarily intertwined and mutually inconsistent roles. Especially when processes and tools already exist for that, as they do at Athabasca University, it would not even be particularly costly. Inertia would be a bigger problem than anything else, but even big ships can eventually be steered in other directions. We just have to choose to make it so.

 

Reference

Bruner, J. S. (1966). Toward a Theory of Instruction. Cambridge MA: The Belknap Press of Harvard University Press.

Every attempt to manage academia makes it worse

Excellent post from Mike Taylor on the inevitable consequences of the use of incentives to shape a system (in this case, an educational system). As Mike notes, the problem is well-known and well understood, yet  otherwise intelligent people continue to rely on extrinsic incentives to attempt to shape behaviour. It’s a classic Monkey’s Paw problem – you get what you wish for but something very bad will inevitably happen, often worse than the problem you are trying to solve. We can make people do things with extrinsic incentives (reward and punishment), sure, but in doing so we change the focus from what we want to achieve to the reward itself, which invariably destroys intrinsic motivation to do what we want done, reinforces our power (and thus the weakness of those we ‘incentivize’), and ultimately backfires on us in tragically predictable ways, because what we actually want done is almost never the thing we choose to measure.

[Figure: some consequences of incentives, Edwards and Roy (2017)]

Our educational systems (and many others) are built around extrinsic incentives, from grades through to performance-related pay through to misguided research assessment exercises, evaluations based on publication records, etc. The consequences are uniformly dire.

Mike quotes Tim Harford (from http://timharford.com/2016/09/4035/) as providing what seems to me to be the only sensible solution:

“The basic principle for any incentive scheme is this: can you measure everything that matters? If you can’t, then high-powered financial incentives will simply produce short-sightedness, narrow-mindedness or outright fraud. If a job is complex, multifaceted and involves subtle trade-offs, the best approach is to hire good people, pay them the going rate and tell them to do the job to the best of their ability.”

Well said. Except that I would add that the effects on motivation of any incentive scheme are always awful, and that’s the biggest reason not to do it. It’s not just that it doesn’t achieve the results we hope for: it’s that it is unkind and dehumanizing. With that in mind, I wouldn’t tell them to do the job to the best of their ability. I might ask them. I might help to structure a system so that they and everyone else can see the positive and negative consequences of actions they take. I might try to nurture a community where people value one another and are mutually supportive. I might talk to them about what they are doing and offer my support in helping them to do it better. I might try to structure the system around what people want to do rather than trying to make them fit in the system I want to build. At least, that’s what I would do on a good day. On a bad day, under pressure from multiple quarters, overworked and overstressed, I might fall back on a three line whip or a plea to do their bit. I might make trades (‘do this and I will take away that’) or appeal to a higher authority (‘the Dean says we must…’) or to my own authority (‘this has to be done and you are the best one to do it..’), or to duty (‘it is in our contract that we have to do performance assessments…’).  And that’s where the problems begin.

Mike recommends Tim Harford’s ‘The Undercover Economist’ as a way out of this loop. I will read this, as I have read many books offering similar insights. It seems at first glance to fit very well with the findings of self-determination theory as well as behavioural economics. However, though the causes described here are the result of a failure to understand human motivation, this is, at heart, a systems problem of a broader nature: I recommend The Systems Bible (formerly Systemantics) by John Gall for a comprehensive set of explanations of the kinds of phenomena that give rise to stupid behaviour by groups of intelligent people. The book is deliberately funny, but the underlying theory on which it is based is extremely sound.

Address of the bookmark: https://svpow.com/2017/03/17/every-attempt-to-manage-academia-makes-it-worse/

How Do You Motivate Kids To Stop Skipping School?

Not like this!

This article starts with the line ‘it seems like a no-brainer’  and indeed it is. The no-brainer solution to low attendance is to make the schools relevant, meaningful and interesting to the kids.

However, bizarrely, that is not what seemed obvious to the writer of the article, nor to the ones that carried out this harmful and doomed research, who thought the obvious answer was an incentive scheme, and inflicted it on 799 kids, mostly age 9. Basically, they told the kids they would get two pencils and a cute eraser if they turned up 85% of the time during the 38-day study.

It seems that they did not bother with a literature review because, had they done so, they would have found out right away that rewards are totally the opposite of what is needed to motivate kids to attend school. There is over 50 years of compelling evidence from research on motivation, in many fields and from many disciplines, that demonstrates this unequivocally and beyond any reasonable doubt. The only possible consequence of this intervention would be to demotivate the kids so that, at best, they might revert to former behaviours at the end of it, and that many would be even less likely to attend when it was over.

Unsurprisingly, this is exactly what they found. The reward program did indeed increase attendance while it was in effect (this is the allure of behaviourism and why it still holds sway – it does achieve immediate results) and, when it was over, kids were indeed even less motivated to attend than they had been before, exactly as theory and empirical research predict. In fact, many of the kids got off very lightly: formerly high attenders and those that were not great attenders before but that succeeded in getting the reward only fell back to baseline levels as soon as it was over, which is actually pretty good going. A more significant reward or longer study period might have had worse consequences. Unfortunately, the effects on the ones that were the real target (those who were initially low attenders, 60% of whom failed to meet the goal) were disastrous: once the intervention was over, these already at-risk kids were only a quarter as likely to attend as they had been before the intervention began.

One of the surprised researchers said:

“I almost felt badly about what we had done,” she says. “That in the end, we should not have done this reward program at all.”

Almost? Seriously. This borders on child abuse. I generally think of research ethics boards as an arguably necessary evil but, when I hear that experiments like this are still going on, I could easily become a fan.

Address of the bookmark: http://www.npr.org/sections/goatsandsoda/2015/05/22/407947554/how-do-you-motivate-kids-to-stop-skipping-school