Educational ends and means: McNamara’s Fallacy and the coming robot apocalypse (presentation for TAMK)

 

These are the slides that I used for my talk with a delightful group of educational leadership students from TAMK University of Applied Sciences in Tampere, Finland, at (for me) a somewhat ungodly hour Wednesday night/Thursday morning after a long day. If you were in attendance, sorry for any bleariness on my part. If not, or if you just want to re-live the moment, here is the video of the session (thanks Mark!).

The brief that I was given was to talk about what generative AI means for education and, if you have been following any of my reflections on this topic then you’ll already have a pretty good idea of what kinds of issues I raised about that. My real agenda, though, was not so much to talk about generative AI as to reflect on the nature and roles of education and educational systems because, like all technologies, the technology that matters in any given situation is the enacted whole rather than any of its assembled parts. My concerns about uses of generative AI in education are not due to inherent issues with generative AIs (plentiful though those may be) but to inherent issues with educational systems that come to the fore when you mash the two together at a grand scale.

The crux of this argument is that, as long as we think of the central purposes of education as being the attainment of measurable learning outcomes or the achievement of credentials, especially if the focus is on training people for a hypothetical workplace, the long-term societal effects of inserting generative AIs into the teaching process are likely to be dystopian. That’s where Robert McNamara comes into the picture. The McNamara Fallacy is what happens when you pick an aspect of a system to measure, usually because it is easy, and then you use that measure to define success, choosing to ignore or to treat as irrelevant anything that cannot be measured. It gets its name from Robert McNamara, US Secretary of Defense during the Vietnam War, who famously measured who was winning by body count, which is probably among the main reasons that the US lost the war.

My concern is that measurable learning outcomes (and still less the credentials that signify having achieved them) are not the ends that matter most. They are, more, means to achieve far more complex, situated, personal and social ends that lead to happy, safe, productive societies and richer lives for those within them. While it does play an important role in developing skills and knowledge, education is thus more fundamentally concerned with developing values, attitudes, ways of thinking, ways of seeing, ways of relating to others, ways of understanding and knowing what matters to ourselves and others, and finding how we fit into the social, cultural, technological, and physical worlds that we inhabit. These critical social, cultural, technological, and personal roles have always been implicit in our educational systems but, at least in in-person institutions, they seldom need to be made explicit because they are inherent in the structures and processes that have evolved over many centuries to meet this need. This is why naive attempts to simply replicate the in-person learning experience online usually fail: they replicate the intentional teaching activities but neglect to cater for the vast amounts of learning that occur simply due to being in a space with other people, and all that emerges as a result of that. It is for much the same reasons that simply inserting generative AI into existing educational structures and systems is so dangerous.

If we choose to measure the success or failure of an educational system by the extent to which learners achieve explicit learning outcomes and credentials, then the case for using generative AIs to teach is extremely compelling. Already, they are far more knowledgeable, far more patient, far more objective, far better able to adapt their teaching to support individual student learning, and far, far cheaper than human teachers. They will get better. Much better. As long as we focus only on the easily measurable outcomes and the extrinsic targets, simple economics combined with their measurably greater effectiveness means that generative AIs will increasingly replace teachers in the majority of teaching roles.  That would not be so bad – as Arthur C. Clarke observed, any teacher that can be replaced by a machine should be – were it not for all the other more important roles that education plays, and that it will continue to play, except that now we will be learning those ways of being human from things that are not human and that, in more or less subtle ways, do not behave like humans. If this occurs at scale – as it is bound to do – the consequences for future generations may not be great. And, for the most part, the AIs will be better able to achieve those learning outcomes themselves – what is distinctive about them is that they are, like us, tool users, not simply tools – so why bother teaching fallible, inconsistent, unreliable humans to achieve them? In fact, why bother with humans at all? There are, almost certainly, already large numbers of instances in which at least part of the teaching process is generated by an AI and where generative AIs are used by students to create work that is assessed by AIs.

It doesn’t have to be this way. We can choose to recognize the more important roles of our educational systems and redesign them accordingly, as many educational thinkers have been recommending for considerably more than a century. I provide a few thoughts on that in the last few slides that are far from revolutionary but that’s really the point: we don’t need much novel thinking about how to accommodate generative AI into our existing systems. We just need to make those systems work the way we have known they should work for a very long time.

Download the slides | Watch the video

Stories that matter and stories that don’t: some thoughts on appropriate teaching roles for generative AIs

Well, this was definitely going to happen.

The system discussed in this Wired article is a bot (not available to the general public) that takes characters from the absurdly popular Bluey cartoon series and creates personalized bedtime stories involving them for its creator’s children using ChatGPT+. This is something anyone could do – it doesn’t take a prompt-wizard or specialized bot to do this. You could easily make any reasonably proficient LLM incorporate your child’s interests, friends, family, and characteristics and churn out a decent enough story from it. With copyright-free material you could make the writing style and scenes very similar to the original. A little editorial control may be needed here and there but I think that, with a smart enough prompt, it would do a fairly good, average sort of a job, at least as readable as what an average human might produce, in a fraction of the time. I find this to be hugely problematic, though, and not for the reasons given in the article, though there are certainly some legal and ethical concerns, especially around copyright and privacy as well as the potential for generating dubious, disturbing, or otherwise poor content.

Why stories matter

The thing that bothers me most about this is not the quality of the stories but the quality of the relationship between the author and the reader (or listener).  Stories are the most human of artifacts, the ways that we create and express meaning, no matter how banal. They act as hooks that bind us together, whether invented by a parent or shared across whole cultures. They are a big part of how we learn and establish our relationships with the world and with one another. They are glimpses into how another person thinks and feels: they teach us what it means to be human, in all its rich diversity. They reflect the best and the worst of us, and they teach us about what matters.

My children were in part formed by the stories I made up or read to them 30 or more years ago, and it matters that none were made by machines. The language that I used, the ways that I wove in people and things that were meaningful to them, the attitudes I expressed, the love that went into them, all mattered.  I wish I’d recorded one or two, or jotted down the plots of at least some of the very many Lemmie the Suicidal Lemming stories that were a particular favourite. These were not as dark as they sound – Lemmie was a cheerful creature who just happened to be prone to putting himself in life-threatening situations, usually as a result of following others. Now that they have children of their own, both my kids have deliciously dark but fundamentally compassionate senses of humour and a fierce independence that I’d like to think may, in small part, be a result of such tales.

The books I (or, as they grew, we, and then they) chose probably mattered more. Some had been read to me by my own parents and at least a couple were read to them by their own parents. Like my children, I learned to read very young, largely because my imagination was fired by those stories, and fired by how much they mattered to my parents and siblings. As much as the people around me, the people who wrote and inhabited the books I listened to and later read made me who I am, and taught me much of what I still know today – not just facts to recall in a pub quiz but ways of thinking and understanding the world, and not just because of the values they shared but because of my responses to them, that increasingly challenged those values. Unlike AI-generated tales, these were shared cultural artifacts, read by vast numbers of people, creating a shared cultural context, values, and meanings that helped to sustain and unite the society I lived in. You may not have read many of the same books I read as a middle-class boy growing up in 1960s Britain but, even if you are not of my generation or cultural background, you might have read (or seen video adaptations of) one or more children’s works by A.A. Milne, Enid Blyton, C.S. Lewis, J.R.R. Tolkien, Hans Christian Andersen, Charles Dickens, Lewis Carroll, Kenneth Grahame, Rev. W. Awdry, T.S. Eliot, the Brothers Grimm, Norton Juster, Edward Lear, Hugh Lofting, Dr. Seuss, and so on. That matters, and it matters that I can still name them. These were real authors with attitudes, beliefs, ideas, and styles unlike any other. They were products and producers of the times and places they lived in. Many of their attitudes and values are, looking back, troublesome, and that was true even then. So many racist and sexist stereotypes and assumptions, so many false beliefs, so many values and attitudes that had no place in the 1960s, let alone now.
And that was good, because it introduced me to a diversity of ways of being and thinking, and allowed me to compare them with my own values and those of other authors, and it prepared me for changes to come because I had noticed the differences between their context and mine, and questioned the reasons.

With careful prompting, generative AIs are already capable of producing work of similar quality and originality to fan fiction or corporate franchise output around the characters and themes of these and many other creative works, and maybe there is a place for that. It couldn’t be much worse than (say) the welter of appallingly sickly, anodyne, Americanized, cookie-cutter, committee-written Thomas the Tank Engine stories that my grandchildren get to watch and read, that bear as little resemblance to Rev. W. Awdry’s sublimely stuffy Railway Stories as Star Wars. It would soften the sting when kids reach the end of a much loved series, perhaps. And, while it is a novelty, a personalized story might be very appealing, albeit that there is something rather distasteful about making a child feel special with the unconscious output of a machine to which nothing matters. But this is not just about value to individuals, living with the histories and habits we have acquired in pre-AI times. This is something that is happening at a ubiquitous and massive scale, everywhere. When this is no longer a novelty but the norm it will change us, and change our societies, in ways that make me shiver. I fear that mass-individualization will in fact be mass-blandification, a myriad of pale shadows that neither challenge nor offend, that shut down rather than open up debate, that reinforce norms that never change and are never challenged (because who else will have read them?), that look back rather than forward, that teach us average ways of thinking, that learn what we like and enclose us in our own private filter bubble, keeping us from evolving, that only surprise us when they go wrong. This is in the nature of generative AIs because all they have to learn from is our own deliberate outputs and, increasingly, the outputs of prior generative AIs, not from any kind of lived experience. They are averaging mirrors whose warped distortions can convince us they are true reflections. 
Introducing AI-generated stories to very young children, at scale, seems to me to be an awful gamble with very high stakes for their futures. We are performing uncontrolled experiments with stuff that forms minds, values, attitudes, expectations, and meanings that these kids will carry with them for the rest of their lives, and there is at least some reason to suspect that the harm may be greater than the good, both on an individual and a societal level. At the very least, there is a need for a large amount of editorial control, but how many parents of young children have the time or the energy for that?

That said…

Generating, not consuming output

I do see great value in working with and supporting the kids in creating the prompts for those stories themselves. While the technology is moving too fast for these evanescent skills to be describable as generative AI literacies, the techniques they learn and discoveries they make while doing so may help them to understand the strengths and limitations of the tools as they continue to develop, and the outputs will matter more because they contributed to creating them. Plus, it is a fun way to learn. My nearly 7-year-old grandchild, with the help of their father, has enjoyed and learned a lot from creating images with DALL-E, for instance, and has been doing so long enough to see massive improvements in its capabilities, so has learned some great meta-lessons about the nature of technological evolution too. This has not stopped them from developing their own artistic skills, including with the help of iPads and AI-assisted drawing tools, which offer excellent points of comparison and affordances to reflect on the differences. It has given them critical insight into the nature of the output and the processes that led to it, and it has challenged them to bend the machine to do what they want it to do. This kind of mindful use of the tools as complementary partners, rather than consumption of their products, makes sense to me.

I think the lessons carry forward to adult learning, too. I have huge misgivings about giving generative AIs a didactic role, for the same reasons that having them tell stories to children worries me. However, they can be great teachers for those that make use of them to create output, rather than being targets of the output they have created. For instance, I have been really enjoying using ChatGPT+ to help me write an Elgg plugin over the past few weeks, intended to deal with a couple of show-stopping bugs in an upgrade to the Landing that I had been struggling with for about 3 years, on and (mostly) off. I had come to see the problems as intractable, especially as a fair number of far smarter Elgg developers than I had looked at them and failed to see where the problems lay. ChatGPT+ let me try out a lot more ideas than even a large team of developers would have been able to come up with alone, and it took care of some of the mundane repetitive work that made the process slow. Though none of it was bad, little of its code was particularly good: it made up stuff, omitted stuff, and did things inefficiently. It was really good, though, at putting in explanatory comments and documenting what it was doing. This was great, because the things I had to do to fix the flaws taught me a lot more than I would have learned had they been perfect solutions. Nearly always, it was good enough and well-documented enough to set me on the right path, but the ways it failed drove me to look at source documentation, query the underlying database (now knowing what to look for), follow conversations on GitHub, and examine human-created plugins, from which I learned a lot more and got further inspiration about what to ask the LLM to do next.
Because it made different mistakes each time, it helped me to slowly develop a clearer model of how it should really have happened, so I got better and better at solving the problems myself, meanwhile learning a whole raft of useful tricks from the code that worked and at least as much from figuring out why it didn’t. It was very iterative: each attempt sparked ideas for the next attempt. It gave me just enough scaffolding to help me do what I could not do alone. About halfway through I discovered the cause of the problem – a single changed word in the 150,000+ lines of code in the core engine, that was intended to better suit the new notification system, but that resulted in the existing 20m+ notification messages in the system failing to display correctly. This gave me ideas for some better prompts, the results of which taught me more. As a result, I am now a better Elgg coder than I was when I began, and I have a solution to a problem that has held up vital improvements to an ailing site used by more than 16,000 people for many years (though there are still a few hurdles to overcome before it reaches the production site).

Filling the right gaps

The final solution actually uses no code from ChatGPT+ at all, but it would not have been possible to get to that point without it. The skills it provided were different to and complementary to my own, and I think that is the critical point. To play an effective teaching role, a teacher has to leave the right kind of gaps for the learner to fill. If they are too large or too small, the learner learns little or nothing. The to and fro between me and the machine, and the ease with which I could try out different ideas, eventually led to those gaps being just the right size so that, instead of being an overwhelming problem, it became an achievable challenge. And that is the story that matters here.

The same is true of the stories that inspire: they leave the right-sized gaps for the reader or listener to fill with their own imaginations while providing sufficient scaffolding to guide them, surprise them, or support them on the journey. We are participants in the stories, not passive recipients of them, much as I was a participant in the development of the Elgg plugin and, similarly, we learn through that participation. But there is a crucial difference. While I was learning the mechanical skills of coding from this process (as well as independently developing the soft skills to use them well), the listener to or reader of a story is learning the social, cultural, and emotional skills of being human (as well as, potentially, absorbing a few hard facts and the skills of telling their own stories). A story can be seen as a kind of machine in its own right: one that is designed to make us think and feel in ways that matter to the author. And that, in a nutshell, is why a story produced by a generative AI is such a problematic idea for the reader, but the use of a generative AI to help produce that story can be such a good idea for the writer.

Originally posted at: https://landing.athabascau.ca/bookmarks/view/21680600/stories-that-matter-and-stories-that-dont-some-thoughts-on-appropriate-teaching-roles-for-generative-ais