Learning centricity vs Learner centricity: some thoughts on Dave Cormier’s human-centred model for discussing AI in assessment

Posted June 12, 2026, updated on June 12, 2026 by Jon Dron

I don’t think I’ve ever read any published article by Dave Cormier that I didn’t like, and I’ve read quite a few. This latest blog post, “A human-centred model for discussing AI in assessment” is no exception. In it, Dave describes his framework for discussing AI in assessment with teachers and other course designers, revolving around 3 questions:

Did they do it?
Have they learned?
Are we helping?

I love the simplicity: I like how the questions cut through to what matters with absolutely no jargon or equivocal wording. I really like the emphasis on learning and the lack of explicit AI focus. These are good questions for any credentialed work, not just in the context of AI. I think I’d be inclined to add one more question:

Have we learned?

because this is really the point of it all, and we should be designing our assessments so that we do learn, but maybe that is implicit in the context. There might be other questions to ask like “To what extent can we prove this?” or “Is it equitable?” or (my favourite) “Did it bring joy?” and so on, because there are bigger systems into which this needs to slot, but maybe they are for a different and more focused conversation.

However, though the questions are great, the diagram puzzles me:

After poring over it for a while, I think the overlaps are meant to represent where the answer to the question is “yes” (so the sweet spot is where they did it, we helped, and they learned in the process) while the non-overlapping elements are those to which the answer is “no”, and the labels of the intersections are ways of describing the results of aligning them. If so, it’s an odd way of using Venn diagrams. The core problem is that the questions are not sets, so they cannot intersect and, to make it worse, they are questions about different kinds of entity that could never be overlapping sets:

The work submitted for assessment (artifacts).
What students learned (cognitive changes).
What we did to try to help them learn (interventions).

I like the idea of a Venn diagram for this, though, so I started to wonder what should really be in those sets, and what follows is the result.

A learning centred model for discussing AI and assessment

Rather than focusing on the whole learner, I think it would make more sense were we to look at the questions from the perspective of the learning (latent or actual) that each question is asking about:

Intended learning: the knowledge, including subject and pedagogical knowledge and skills, both tacit and explicit, that the teacher (a distributed entity including designated teachers, the learner, other learners, textbook authors, institutional systems and policies, etc) intends to facilitate. Subsumes learning outcomes but also includes ways of thinking, learning, acting, etc.
Actual learning: the knowledge and skills gained in the process. This speaks not only to what is learned but also to how it is accommodated and integrated with existing knowledge and skills.
Learning exhibited in assessed work: some being the result of what the student did, perhaps some being from something/someone else, for more or less legitimate reasons.

In real life, this is all deeply intertwingled and learning cannot possibly fall so neatly into measurable sets. Any measurements are almost entirely notional. We might be able to get some clues about the knowledge it represents, but knowledge is not neatly quantifiable either. It can be extended, embodied, enacted, or embedded, and it’s a collective as well as an individual phenomenon but, even from an individual perspective, it’s not a thing you can count. It’s a thing you do as much as a thing you have. However, this is a model, not the reality that it models, and it is to support a conversation about design and performance, so we don’t need precision or anything like it. It just needs to be good enough to be able to talk about what we need to talk about.

Here’s my attempt at a quick and dirty diagram to represent this:

Dave’s questions can easily be overlaid on top of this in order to explore how it plays out in any particular course, as can quite a few others. It illustrates that some or all of the work submitted might not contribute to student learning, and that some or all of it might not be the result of our teaching. It correctly shows that we routinely assess work that students did not do, and that we only assess a sample of what was learned. It highlights the fact that a lot of learning happens that is neither taught nor assessed, that we teach things we do not assess, that teaching behaviours don’t always result in learning, and we assess things we do not teach. It represents the fact that learning can happen even when the student is not the creator of the work that is supposed to lead to it and/or represent it, which can be especially significant in a genAI context. It allows us to overlay plenty of other questions, including the extent to which the tasks we set and the time we allowed for doing them made it more likely that the work submitted was not that of the student.

Possibly wasted efforts?

I’ve labelled learning that is exhibited in the assessed work, or only in the teacher’s intentions, but not in the student, as possibly wasted effort. This is not necessarily a bad thing, as long as the learning that did occur was sufficient and sufficiently worthwhile. In some cases it is absolutely normal and acceptable: for example, if a student provides a website that uses frameworks or libraries, it would be incomplete without them, even though the knowledge they signify is not that of the student (depending on how you understand “knowledge”). It is also part of the process that it has to be a bit lossy. Learning needs breathing room for scaffolds, connections, and shuffling things around that don’t immediately fit. It’s not information: it’s a living, breathing, active thing.

All that said, on the whole, I think it would be better if more learning had occurred. Anything that doesn’t fall within the “actual learning” set is stuff that could have been but has not been learned. And, for credentialed assessment, it is quite important that we don’t wind up awarding credentials for things the students have not learned. The contents of these subsets can also indicate failure: if a student uses a generative AI or hires someone to create the work, it may well be a complete waste of time. If a teacher attempts to teach something but the student does not learn it then, though there is a possibility that the student learned other useful things (like to avoid the teacher’s classes), it seems a little wasteful.

Definitely wasted efforts

One of the most interesting subsets is the intersection of work submitted for assessment that complies with the requirements, that we tried to teach, but that is not the result of learning, at least of the sort we seek. Unless it is flagged as such and done intentionally, I think of this as the zone of unreliable assessment. It is the “bad AI” and copypasta zone, but there are plenty of other ways it can occur, including when students formulaically do something that meets the outcomes without understanding why or how it works. Whatever the cause, if we award credentials for knowledge and skills that students lack, we are failing in our credentialing role. This is not a big issue if it is ungraded, of course: it might even be a good thing that assists further learning. It is not unheard of for unlearned stuff to be the only thing we assess, though, and that unequivocally sucks. It’s not just a missed goal but an own goal.

Undiscovered outcomes

Another really interesting subset is the zone of unmeasured outcomes: outcomes for which there is clear evidence in the work provided, but that don’t fit the rubric or intended outcomes of the teacher. I think there is great scope for outcome harvesting here. It is not unusual for the most important outcomes of a course to be unintended and unmeasured, but it is quite unusual for us to provide credentials for them. I think we should try to do so, if we are serious about being human-centred. And, if there are lots of untaught outcomes that consistently appear in the assessed work, it might imply a shortfall in our teaching or description of the course. It’s a lot of work to mine learning outcomes, but GenAIs could be helpful: if we provided them with the work, intended outcomes, and a rubric, then asked what other competences were demonstrated in the work, they might provide a sufficiently close approximation for us to make the judgment calls needed to help our students.
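As a rough illustration of the outcome-harvesting idea above, the request to a GenAI could be assembled like this. It is only a sketch: the function name, the wording of the prompt, and the sample inputs are all invented for illustration, no particular model or API is assumed, and whatever comes back would still need the teacher's judgment.

```python
def build_harvest_prompt(work, intended_outcomes, rubric):
    """Assemble a plain-text prompt asking for competences beyond the intended ones.

    Hypothetical helper: it only builds a string and does not call any model.
    """
    outcomes = "\n".join(f"- {outcome}" for outcome in intended_outcomes)
    return (
        "You are helping a teacher look for unintended learning outcomes.\n\n"
        f"Intended outcomes:\n{outcomes}\n\n"
        f"Rubric:\n{rubric}\n\n"
        f"Student work:\n{work}\n\n"
        "List any other competences demonstrated in the work that are not "
        "covered by the intended outcomes or the rubric. Quote the passage "
        "that supports each one and say how confident you are."
    )


# Invented sample inputs, purely for illustration.
prompt = build_harvest_prompt(
    work="(student submission goes here)",
    intended_outcomes=["explain recursion", "write unit tests"],
    rubric="(rubric text goes here)",
)
print(prompt)
```

Asking for the supporting passage alongside each suggested competence is a design choice that makes it quicker for a human to check the suggestion against the work itself.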

Unmeasured teaching

The zone of unmeasured teaching contains the learning outcomes that we intentionally taught and that were met but that we didn’t assess. I don’t see this as a particularly major issue, apart from the fact that, if we are giving credentials rather than just using assessment for learning purposes, it would be kinder to make it possible for this additional evidence to be considered. This is a good reason for using a portfolio approach to assessment. If we are not doing that, then this is the zone in which to look for alternative assessments or better constructive alignment.

Mysterious learning

The most interesting zone of all (to me) is the one I have labelled as “Here there be dragons” which, following the lead of ancient cartographers, is my way of describing the significant subset of student knowledge about which we know little or nothing, that occurs while students are learning what we are teaching, that is not reflected (directly) in the assessed work, but that is neither what we taught nor what we intended to teach. This is the frontier territory that few of us ever enter unless we are committed complexivists, but that I think we need to explore most of all. This is especially so in an age of generative AI, where our traditional proxies for learning are breaking all around us and learners have far more supports outside the institution than within it. Reflective portfolios can help reveal a little of it, as can approaches to assessment in which we ask for evidence (any evidence) rather than compliance with our demands. It’s another space where AIs that could observe students learning might help, if everyone involved happened to be OK with some huge privacy, security, and trust issues. Knowing more about this and celebrating its existence opens up the potential for more learner-controlled outcomes, for the teacher to also be the taught, and for moving more of the unmeasured outcomes into the sweet spot.

The sweet spot

The sweet spot – essentially the happy intersection between teaching, learning, and assessment – is where constructive alignment occurs, and it should normally be as large as possible. Its relative size is a good proxy for the effectiveness of teaching and assessment, especially if the zone of unreliable assessment is small. There will never and should never be a perfect overlap as long as the knowledge is worth knowing, and there is always knowledge embedded in artifacts that is only borrowed from its creators, but the more we can know of what and how the students know, the more they know of what we know, the more that the evidence of knowledge in what we assess overlaps with what students actually know, the better it will be for everyone. The two-way knowledge flow between learner and teacher (bearing in mind teachers include the students themselves) is particularly important for achieving that.
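For anyone who prefers set notation to diagrams, the zones discussed above can be sketched with three Python sets. This is a toy: the items are invented labels, the mapping of zone names to set expressions is one reading of the descriptions above rather than a definitive statement of the diagram, and (as noted earlier) real learning is not countable like this.

```python
# Invented example items; each string stands in for "some learning".
intended = {"essay structure", "citation practice", "critical reading", "statistics basics"}
actual = {"essay structure", "critical reading", "time management", "prompt writing", "peer feedback"}
exhibited = {"essay structure", "citation practice", "prompt writing", "web framework internals"}


def zones(intended, actual, exhibited):
    """Split the three sets into the seven regions of a three-set Venn diagram."""
    return {
        "sweet spot": intended & actual & exhibited,
        "unreliable assessment": (intended & exhibited) - actual,
        "unmeasured outcomes": (actual & exhibited) - intended,
        "unmeasured teaching": (intended & actual) - exhibited,
        "here there be dragons": actual - intended - exhibited,
        "possibly wasted (intended only)": intended - actual - exhibited,
        "possibly wasted (exhibited only)": exhibited - actual - intended,
    }


result = zones(intended, actual, exhibited)
for name, items in result.items():
    print(f"{name}: {sorted(items)}")
```

The seven regions are disjoint and together cover everything in the three sets, so any question overlaid on the model (such as “Did they do it?”) can be asked of each region in turn.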

Is this model in any way usable?

I don’t know if this is any better for this purpose than Dave’s original model: probably not. It’s more cohesive as a Venn diagram but it is more opaque as a prompt for dialogue, and I’m not entirely sure my model captures quite the same phenomena. In caricature, Dave’s is about learners, while mine is about learning. In conjunction with his questions, though, I think it might help to reveal more about both.

Paper: “Redefining Educational Technology: A Critical Collaborative Inquiry”, now available via Open Praxis

Posted June 2, 2026 by Jon Dron

I am fifth of 26 authors, led by the remarkable Aras Bozkurt, of a new paper, “Redefining Educational Technology: A Critical Collaborative Inquiry,” published in Open Praxis this month. I find myself in the company of some extremely luminous researchers from around the world, old and young, famous and less famous, who worked together on this using a collective writing methodology. In a loosely Delphi-like process, Aras started by gathering some quite detailed answers to some fairly open questions around the topic, assembled them into a draft of the paper, and then let us rip on it for a month or two, before some intensive collaborative work resolving what appeared to be hundreds (but might have just been scores) of comments involving quite rich debate and contrary opinions. The fact that we were spread across many timezones meant that, for a few weeks, there were always plenty of new changes and comments to follow every morning when I woke up and each evening at the end of a working day, and it got to be fiendishly difficult but very rewarding to follow them all. Seeing the finished paper, I’m sad it is now fixed. I still see a few places that I would like to make some small changes, so it would be super cool to be able to open it up to the broader community for further development. Imagine what not just 26 but 100 or even 1000 authors could come up with: a truly timely and organically evolving definition of the field.

I’ve participated in a few of these collective writing projects over the past few years, including Speculative Futures on ChatGPT and Generative Artificial Intelligence (AI): A Collective Reflection from the Educational Landscape, The Manifesto for Teaching and Learning in a Time of Generative AI: A Critical Collective Stance to Better Navigate the Future, and Venturing into the Unknown: Critical Insights into Grey Areas and Pioneering Future Directions in Educational Generative AI Research. All (especially the first) have been cited at a far higher rate than any other papers I have ever been involved with, partly because they are all really good, timely and authoritative papers, partly due to the skill and reputation of those who have led them, and partly because of promotion and citation by the large number of well-known authors involved in writing them. They have some of the same strengths as meta-studies (generally among the most highly cited papers in any field), representing an assortment of views stemming from prior research, with the added benefit of the original researchers having to defend their cases to the rest, albeit with social factors intruding that can lead to group-think, more voluble participants getting more of the air time, and so on.

Though the positive factors remain much the same in this one, I’m not sure whether this new paper will achieve a similar impact. It will get cited because it is, arguably, among the best ever written on the evolution of the educational technology field itself. If you are working in a niche where you need a shorthand high-level abstract analysis of every major thing that has happened in the field for at least the last half century, as well as (in more detail) currently significant themes, then this is the paper for you. However, I’d be slightly surprised if the definition we came up with at the end of it will get a great deal of broader traction. I think Aras did a great job of editing it down to something most of us found agreeable but it was a Sisyphean task that was never going to lead to something that would delight everyone. Apart from anything else, though a vast improvement on some of our earlier drafts, the definition is still very long and unwieldy. The refinements in the final paragraph are more contentious because they speak to aspirations rather than describing the whole reality. With such a large and diverse group hacking at it, it was very hard to come up with something that we all agreed with but, rather than get rid of them, Aras moved the more aspirational and prescriptive parts of it to the end. My words are in it but it’s a committee definition, and definitely not one I would write myself. I think I’d currently go with something like “the organization of stuff for learning and accreditation” though educational systems are complex and concerned with far more than just learning and credentials, so I might refine it a little further.

Here is the definition we actually came up with, in all its slightly awkward glory:

Educational technology, as a field of inquiry and practice, encompasses the research, understanding, design, orchestration, and evaluation of entangled human-technological systems — spanning analog, digital, organizational, social, and agentic dimensions — through which learning and meaning-making are enabled, mediated, supported, and transformed.

The field brings together researchers, practitioners, educators, communities, and institutions in ongoing efforts to study and improve educational experiences across formal, non-formal, and informal contexts.

The field holds as a core commitment that its theory, research and practice should be ethically grounded and critically reflexive — attending to the societal implications of technological integration, with particular concern for equity and the distribution of agency among all participants in educational processes. These commitments describe what the field aspires to, not a guarantee of how all its practice is enacted.

One important limitation: despite the large number of people from across the world involved, we are not perfectly representative of the field as a whole. I don’t think we had anyone from the training or corporate learning industry; the authorship was skewed towards researchers from relatively developed countries; we didn’t have many edtech geeks; there was not a lot of K-12 focus; women only accounted for less than a third of the authors; we were a little light on the informal and non-formal aspects; and we shared some non-universal attitudes, notably in having above average positive feelings towards openness. I don’t think many (if any) of us are involved with AECT, and that matters because the AECT plays a large role as a primary source of earlier definitions we respond to. In particular, I and my fellow co-authors are all part of a large but not ubiquitous community that would recognize complexivist accounts (e.g. Connectivism, entanglement theory, rhizomatic learning, networks of practice) as the most significant pedagogical models of our time, while AECT, even in its most recent definition, has never formally acknowledged their existence. The AECT definition also seems remarkably quiet about AI, even in its most recent, post-ChatGPT incarnation. I don’t know whether this is because its members move in different and more conservative educational circles than us, or whether it is a deliberate policy decision to ignore anything less than 10 or 20 years old, but not knowing is exactly why it is troublesome. I think that this makes a compelling case that our paper should be read as at least a counterbalance to theirs, but that there’s a future paper yet to be written that brings the many branches of the field together.
And that is the point: not to create a definition that stands forever, but to be part of the ongoing conversation about what we do and why we do it, to capture a snapshot of what we think we are doing, and to allow it to be challenged and developed further. I’m very pleased to have had the chance to be a part of this. In a very real sense we were walking the talk, enacting and engaging in the kind of learning that we think edtech should support and enable. Aras is a real star for making this happen.

Finally, a personal reflection. One of my own papers is cited flatteringly often in the paper, but not by me, which was quite a novel thing for me to have to deal with. It put me in a slightly awkward position of wanting to explain and sometimes to defend it, while really wishing to cite some of my more recent papers and a whole book that represent a more developed view, yet feeling slightly apologetic and (in the company of many more significant researchers) not at all confident about pushing my views even more than they were already being pushed. It was a weird mix of feeling privileged, feeling proud, knowing that (having written a whole book on it, not to mention a lot of other papers) I should think of myself as something of an expert, but feeling very much like an imposter in the midst of all these smart people, not worthy of the attention. In the end I did chip in some clarifications and expansions, and I argued a few cases (mainly where my own theory was in agreement with others) but I consciously self-censored and there are still a couple of places where the interpretation is not quite what I meant when I wrote it. The soft/hard distinction, in particular, makes me a bit uneasy because, though my own views are represented, a different and (I believe) fundamentally incoherent definition (the classic business/accounting model of immaterial vs material) is partially spliced onto it. However, as I had to keep reminding myself, the paper appears in the text not because I promoted it but because others saw it as significant, so their understanding of it matters more than mine. There’s something self-referential in that, speaking to one of the core messages about the entangled, intertwined, complex, and distributed nature of knowledge that our new definition emphasizes. It was an odd feeling, though, to see what has become of my baby now it has left home and had to fend for itself.

Abstract

Educational technologists have not settled on a fixed definition of the field and likely never will. However, attempting to define the field helps to understand the epistemological meanings that shape what the field sees, values, and considers worth pursuing. Through a critical historical review spanning over a century, alongside theoretical engagement with the concepts of entanglement and distributed agency, this paper identifies three key insufficiencies in current educational technology frameworks. These are the persistence of an instrumental-facilitative paradigm that treats technology as a resource deployed by human agents; the theoretical dissolution of the pedagogy-technology dichotomy that existing definitions have not absorbed; and the near-total silence on generative and agentic artificial intelligence, systems that now function not as tools but as active co-participants in educational processes. In response, we propose a new definition: Educational technology, as a field of inquiry and practice, encompasses the research, understanding, design, orchestration, and evaluation of entangled human-technological systems — spanning analog, digital, organizational, social, and agentic dimensions — through which learning and meaning-making are enabled, mediated, supported, and transformed. The field brings together researchers, practitioners, educators, communities, and institutions in ongoing efforts to study and improve educational experiences across formal, non-formal, and informal contexts. The field holds as a core commitment that its theory, research and practice should be ethically grounded and critically reflexive — attending to the societal implications of technological integration, with particular concern for equity and the distribution of agency among all participants in educational processes. These commitments describe what the field aspires to, not a guarantee of how all its practice is enacted. 
This definition is offered not as a resolution but as a basis for ongoing discussion, as it is best understood as a living definition. The field’s task is not to settle on a definition, but to keep it constantly evolving.

Reference

Bozkurt, A., Crompton, H., Farrow, R., Kukulska-Hulme, A., Dron, J., West, R., Palalas, A. (Aga), Bower, M., Xiao, J., Tlili, A., Henriksen, D., Pazurek, A., Huijser, H., Chiu, T.K.F., Jandrić, P., Jordan, K., Curry, J., Kimmons, R., Cukurova, M., Reeves, T., Hwang, G.-J., Shea, P., Lodge, J., Weller, M., Ng, D. and Asino, T.I. (2026) ‘Redefining Educational Technology: A Critical Collaborative Inquiry’, Open Praxis, 18(2), pp. 192–211. Available at: https://doi.org/10.55982/openpraxis.18.2.1117.

Paradigm shifts, bricoleers [sic], and other animals

Posted May 6, 2026, updated on May 6, 2026 by Jon Dron

Ben Werdmuller is a serial innovator, edtech veteran, and deeply insightful commentator on the tech industry whose skills defy easy categorization. I like him a lot. In One size fits none: let communities build for themselves Ben tells us about how to build digital social systems that fit the needs of their communities, and it is well worth reading if you have any interest in social software.

The post starts with a description of the reaction of developers when, in the summer of 2007, at an Elgg-jam at my then-university in Brighton, Ben first introduced the newly refactored Elgg 1.0 framework. In its several pre-version-1 iterations, Elgg was not a development framework but a full-blown web application. It had blogs, wikis, file sharing, bookmarking, groups, and much more, all wrapped up in a robust social network system with smart discretionary access, extensible very easily through a simple-to-use plugin system. It was easy to use, rich in features, highly adaptable, and it might have been the most popular open source social networking system on the planet at that point. It was a bit hacked-together and not exactly an engineering masterpiece, but it worked really well.

What Ben announced that day stripped away virtually all of its existing functionality, leaving only a tiny core that could do almost nothing user-facing on its own apart from simple user management, the display of activities, and some basic admin tasks. I don’t think it was even possible to create a post and I have a feeling there were floppy disks around at the time onto which the whole thing could fit. The idea was that it was up to developers to provide plugins that end-users could configure to create any kind of social system they wanted, with the core providing the API and data structures to support and greatly simplify their development. A few common tools like blogs, wikis, file sharing, and bookmarks were provided in a package of core plugins to help get things started, but all were (and are) optional. It was extremely elegant.

I believe that I was the person Ben refers to who, many years later (at another Elgg-jam, in San Francisco, as it happens), described his “big reveal” as a mind-blowing moment. Almost every hair on my body stood on end. I got it immediately because I had been thinking along very similar lines – there’s a chapter on such things in my first book, published earlier the same year – and had been, up until that point, intending to spend my newly-acquired national teaching fellowship money on building it. Instead I went with Elgg, which provided the framework on which the Landing and a few other sites (including the one at Brighton to which Ben refers) were built, and the money mostly went towards plugin development for it.

In fact, in the form in which it first launched, Elgg 1.0 wasn’t exactly what I wanted. My vision was more distributed and centred around small services, loosely joined, rather than a single monolithic plugin-based server. The roadmap, though, that Ben described that day made exactly that possible, with plans for a robust and extensible range of services and standards for information interchange that, had they gained any traction, would have made a federated social system of almost any kind simple to create and evolve.

They didn’t gain that traction.

I think a big part of the reason might be that, with no backwards compatibility at all with the older version, and no good migration path for those already running Elgg, it lost almost all of the momentum and good will it had previously gained, and others had moved into the space in the interim that could provide an off-the-shelf experience that was at least as good as the replacement, without the need for further development. In particular, WordPress and BuddyPress were already on the rise. Ben eventually moved on to do other things, Elgg gained a loyal and slowly growing following and became a foundation, but its focus shifted to being a development platform for building bespoke servers rather than a distributed social system. The web services and neat ODD protocol never took off enough to be usable beyond some very limited use cases. However, the plugin-based architecture and tiny core was still a cool idea and building using small pieces for almost everything seemed to me to be a really good way to build a social system, so that’s what I and my teams did. It turns out to be much less cool when you want to maintain it, though, a fact that I was quite well aware of but failed to grasp in its full magnitude until it was too late.

Red Queen development

As we built the Landing we soon ran into the painful flipsides of plugins, which include the fact that you can’t easily remove them once many people use them, the large number of dependencies they create, and the fact that they have to be maintained, at least every time the core gets updated. It is not helped by the fact that, I think for efficiency, backwards compatibility is still rarely much of a consideration when Elgg gets an upgrade: though they will generally survive (with deprecation notices) for a version or two, many old plugins will simply break if they are not updated, often in subtle, difficult to debug ways. And part of the elegance of the design is also one of its greatest flaws: that, though you can design things in a more robust way, any plugin can fully override almost anything provided by any other simply by including a file of the same name and position in the directory hierarchy. This plays havoc with new versions, and makes plugins far more co-dependent than the very self-contained, well-encapsulated services I had been imagining. To make things worse, it does not scale at all well: Elgg’s object-over-relational data model is very elegant, but it is not very efficient when your site grows large, and every data-storing plugin adds to the problem.
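The override-by-path behaviour described above is easy to underestimate, so here is a small Python model of it. To be clear about the assumptions: this is not Elgg's code (Elgg is written in PHP), the plugin names and file paths are invented, and the rule that a later-loaded plugin wins is a simplification made for the sketch.

```python
def resolve_files(plugins):
    """Work out which plugin ends up providing each relative file path.

    `plugins` is an ordered list of (plugin_name, [relative paths]).
    Assumed rule for this sketch: a later plugin silently replaces any
    earlier plugin's file that has the same relative path.
    """
    winner = {}
    overridden = []  # (path, previous provider, new provider)
    for name, paths in plugins:
        for path in paths:
            if path in winner:
                overridden.append((path, winner[path], name))
            winner[path] = name
    return winner, overridden


# Hypothetical plugins and paths, in load order.
plugins = [
    ("blog", ["views/default/object/blog.php", "views/default/blog/sidebar.php"]),
    ("theme", ["views/default/page/header.php"]),
    ("custom_blog", ["views/default/object/blog.php"]),
]

winner, overridden = resolve_files(plugins)
for path, old, new in overridden:
    print(f"{path}: {new} overrides {old}")
```

Nothing in the overriding plugin has to declare its dependence on the original, which is why an upgrade that changes what the original file is expected to do can break the override in ways that are hard to trace.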

At one point the Landing had 116 plugins (admittedly with a few turned off by default), about a third of which we built, a third of which were distributed with the core, and a third of which were community-developed. As well as our own plugins, we gradually had to take on more and more of the community-plugin development ourselves as original developers abandoned them, or face the wrath of those who needed them. Of the 90 or so that are left today, about half are our/my responsibility. When things were going well and we had the funding for a full-time developer, I reckon most plugins took about a person-week of design, development, and testing to upgrade, though the various dependencies and bottlenecks meant that it was rarely less than a month from start to finish before they arrived on the site. Meanwhile, the core was getting updates, sometimes more than once a year. With very little spare cash, especially after losing our full-time developer, there was no way that we could ever hope to keep up with the release cycles of the core and keep the number of plugins we had to maintain. We were stuck in a Red Queen Regime, running harder and harder to stay in the same place. Some call this a technological debt, but it’s just the price of ownership, and we couldn’t pay enough.
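A back-of-envelope calculation makes the Red Queen problem concrete. It uses only the rough figures given above and adds two simplifying assumptions that are not claims about the real workload: that every plugin we are responsible for needs the full person-week at each core release, and that there is just one core release a year.

```python
plugins_remaining = 90       # "the 90 or so that are left today"
our_share = 0.5              # "about half are our/my responsibility"
weeks_per_plugin = 1         # "about a person-week" to upgrade each one
core_releases_per_year = 1   # assumption; the text says "sometimes more than once a year"

our_plugins = round(plugins_remaining * our_share)
person_weeks_per_release = our_plugins * weeks_per_plugin
person_weeks_per_year = person_weeks_per_release * core_releases_per_year

print(f"{our_plugins} plugins, {person_weeks_per_release} person-weeks per core release, "
      f"{person_weeks_per_year} person-weeks per year")
```

On those assumptions, simply standing still takes 45 person-weeks a year, most of one developer's working year, before any new feature or bug fix is attempted.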

It may be a blessing in disguise then, that, some 10 or 11 years ago, the decision over whether to continue development was taken out of our hands by a CIO who refused us any resources to even test let alone to install anything, as a result of a grossly misguided “back to baseline” principle that ravaged many good systems during his tenure, even though we (then) had plenty of money to continue and offered to put it all into his budget. The Landing limped along regardless because it was embedded in many courses, research groups, centres, and so on, so it couldn’t simply be switched off, no off-the-shelf alternative came close to doing anything similar, and we built it to be robust (though never expecting it to still be around, almost unaltered, over a decade later) so it carried on working. With the help of less hostile but never exactly enthusiastic CIOs, we have limped along ever since, very slowly creeping up through the versions on a shoestring budget and odd moments of my own spare time, but we are very far behind the cutting edge.

And then came ChatGPT

LLMs – Claude in particular – can be great at coding, especially for small projects like plugins. I have been vibe coding for a few years now, and it has been incredibly useful in many aspects of my life. However, even the best of them tend to struggle with Elgg plugins. I think it is because there is not enough Elgg code out in the wild, and there have been too many versions and too many approaches to development, so there’s not enough good quality training data. Since the first week of the launch of ChatGPT, I have been trying to get genAIs to help me with Elgg plugin upgrades and bug fixing but, though I have picked up some very helpful ideas in the midst of some very bad attempts at solutions and they have spotted a few bugs for me, not a single line of actual AI-generated code has ever made it onto the Landing. This is going to change.

A few days before Ben wrote his post, on a hunch, after some frustrating attempts at getting Claude, ChatGPT and Gemini to upgrade an existing plugin that was too difficult for me to take on alone, I instead simply asked Claude to make me a new one, with specs I had extracted from the original (using ChatGPT and tweaking the output), but giving it no access to any of the original’s source code or program structure.

Apart from a couple of minor syntax problems that took hardly a minute to fix, it worked first time. It was considerably more polished than the original and, indeed, than almost all the plugins we had written ourselves or commissioned at costs of up to $10,000. It has no deprecated code at all – something that is not even true of plugins in the core for our current Elgg version – and it has all sorts of useful little configuration options that Claude extrapolated from the specs and that I would have been too lazy to bother with, but that make it way more adaptable than its predecessor. It even has a complete set of language files for both French and English – extremely rare in human-made plugins – and it would be trivial to ask it for other languages if we needed them.

I think this works because of the different way Claude approaches the problem compared with how it handles an existing plugin. When trying to fix a broken or obsolete plugin, the plugin itself plays a large influencing role, then Claude pulls on a ragtag bunch of existing plugins as examples, but the paucity and mixed quality of the training data means they are less than wonderful role models. Almost all of its prior attempts included code from a future version of Elgg, or an older one, or one that has never existed, and it quite often did things in a very non-Elgg way. In contrast, when building a new plugin from scratch, its strategy appears to be to read the entire core codebase and all of the official documentation, then to build the plugin to fit, with little or no reference to any existing plugins beyond those that come with the core distribution. When things go wrong, it goes straight to the definitive source of a function in the core, not to a muddle of existing solutions, and its context window (at least in the paid versions) is now large enough for it to contain much if not all of the whole thing, or at least for retrieval-augmented generation to deal with the correct pieces. The small core that was so useful to human developers turns out to be ideal for LLMs.

The key lesson to be drawn from this is that, if the architecture is sufficiently and cleanly modular (as Elgg’s is), then it may now be more effective to recreate components from scratch than to maintain the ones you have already written. If it continues to pan out as it has so far done, I’d say this is a potential game changer. As well as making development extremely agile, it even improves the security of the system because, though any one plugin may yet have flaws despite the apparently high quality of coding, it is not going to stick around for long enough for them to be exploited, and anyone who follows this approach is not going to have the same plugins as anyone else so it’s not worth anyone’s while to develop a specific hack for it. The next upgrade is almost ready so I am only going to use this approach sparingly for now but, when the time comes for the next major upgrade, this is how I intend to do most of it. I won’t let it near core plugins or still-maintained community plugins but, for all those we inherited or created, ChatGPT or Gemini will provide me with the spec. I’ll then run each spec through Claude, getting it to produce the complete plugin including unit tests. It will still take time, and I don’t expect it to work as well all the time, but much of that time will be spent by Claude, not me. At one fell swoop, this almost eliminates the technological debt.

This principle is not necessarily limited to elegantly engineered systems like Elgg. A night or two ago I went through my regular quandary about how to schedule ad hoc meetings for one of my courses. In the past I’ve used wikis, discussion forums, various free (but not quite right) poll-based schedulers like Doodle, and more. None were great, and the ones that worked best raised potential privacy concerns that I was not willing to grapple with. The length of time it takes to get a plugin to production made a Landing plugin a non-starter. Then it struck me that my own personal website would be more private and controllable than any of those, and hosted on Canadian soil (unlike any of the rest) so I went in search of a plugin. WordPress is very inelegant, sprawling software, and plugin development is positively painful compared with Elgg, but the vast numbers of WP developers mean that, among the many tens of thousands of plugins, no matter what the task, at least one will do the job I want, or close enough for me to tweak so that it does. At least that had always been the case until now. To my great surprise, this time, there were none. Something like the functionality does exist in a few polling and scheduling plugins, but with very complex configurations and a lot of unwanted fluff around them, not to mention the need to get premium non-open versions to do what I want. I just wanted a small subset of Doodle’s functionality, that would not store any private data, nor cater for needs I don’t have. So I asked Claude to make it, knowing that it would already be quite skilled in WP development because of the vast number of examples to learn from. It took about 4 attempts to get exactly what I wanted. Overall the whole process took about an hour, including writing the spec, Claude’s thinking time, and the time it took to upload, configure and test it. It works really nicely. 
I actually spent more time earlier looking for the right software than it took to make it from scratch. I have some experience writing specs, but even a beginner could do this with a bit of help from the AI.

Ochlotecture management

I might ask an LLM to build the Spec Manager – essentially a means of managing the application architecture, not unlike a traditional source code management system – that Ben writes about, to simplify and automate some of the workflow, not that it is particularly onerous. However, the time it would save would allow me more time to work on another idea sparked by Ben’s post.

Doing what we already do, better, cheaper, and faster, is quite cool, but the most significant benefits of any new technology come from being able to do things that were previously impossible: it is the adjacent possibles they create and we exploit that drive progress. As Ben says, some of the biggest things that matter in a social system are the what, why, and for whom, and that’s very true, but there’s more. I’ve written previously of the ochlotecture of a social system, by which I mean all the human as well as non-human elements of it that make it do what it does, including the whats, whys, and for-whoms: the written and unwritten rules, the structural topography (networks, group hierarchies, set clusters, etc), the norms around posting, the pace, the interests of the community, the cross-cutting networks, the ethical principles, the aesthetic preferences, the physical spaces they inhabit, and so on, that combine to give shape to a community. In essence it is much like a user model, only for crowds.

It strikes me that it should be possible to build an Ochlotecture Manager in much the same way as we might build the Spec Manager. Exactly how this would work is to be determined but I envisage it including an assortment of personas and scenarios as well as rules, demographics, contextual information, and network/group/set structures. The idea is to try to get away from the traditional functional definitions and instead describe relationships, policies, norms, and so on in a way that, with a bit of work, LLMs will be able to interpret and thus to better fit the site to its community. This would be particularly useful in a learning context, where a lot of software is built or chosen to perform a function, with far too little regard to how it achieves it. It almost never fits exactly what a teacher would like to do, because it ain’t what you do, it’s the way that you do it, that’s what gets results, and you can’t do the same thing the same way for everyone and expect it to be a perfect fit for all of them. The app will most likely generate some YAML or JSON and instructions about how to deal with it. But this doesn’t end with the design.
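As a very rough illustration of the kind of JSON such a manager might emit, here is a minimal sketch. Nothing here is a settled design: the class names, field names, and example values are all hypothetical, chosen only to mirror the ingredients listed above (norms, rules, network/group/set structures, and personas).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Persona:
    """A hypothetical archetypal community member."""
    name: str
    goals: list[str]


@dataclass
class Ochlotecture:
    """Hypothetical crowd model: every field name here is illustrative."""
    community: str
    norms: list[str] = field(default_factory=list)
    rules: list[str] = field(default_factory=list)
    structures: dict[str, list[str]] = field(default_factory=dict)  # networks, groups, sets
    personas: list[Persona] = field(default_factory=list)


model = Ochlotecture(
    community="example-course-community",
    norms=["replies within a few days are fine", "critique ideas, not people"],
    rules=["posts are visible to course members only by default"],
    structures={"groups": ["tutorial-a", "tutorial-b"], "sets": ["interested-in-assessment"]},
    personas=[Persona(name="part-time learner", goals=["catch up quickly after a week away"])],
)

# Serialize to JSON so it can be handed to an LLM alongside the functional spec.
document = json.dumps(asdict(model), indent=2)
print(document)
```

The point of keeping it as plain data is that the same document can be read by people, versioned like a spec, and included in a prompt without any special tooling.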

A much under-utilized adjacent possible of LLMs lies in their potential to connect people and sustain communities. From summarizing conversations or connecting individuals with complementary needs, to nudging conversations or analyzing sentiment, there are many ways LLMs can catalyze interaction, not as a participant but an enabler. Having a clearly specified ochlotecture would make this much easier to achieve. It might not be a bad ochlotectural analyst, too, suggesting and implementing improvements in the design based on not user models but crowd models.

Having done that, it opens up the potential to make this a truly adaptive system, not just changing data and parameters but also the underlying code itself as a community evolves. Imagine, to give a simple example, a discussion forum in which the system observes people regularly responding with “this is great” or similar replies. The system could identify a need for some kind of rating system and, rather than simply implementing a “like” button (which is far from ideal in all situations) it could consult its ochlotectural model to identify what would work best. This could range from a simple change of wording – “recommend”, perhaps, or “rate”, depending on the community – to a multi-dimensional ranking system, that might work better if more precise feedback is needed (e.g. in peer review). More complex changes are possible: it might build a system to (say) manage events, or create photo albums, or implement breakout spaces, or shift between threaded and non-threaded discussions. Perhaps it could shuffle menus to better fit community needs, or fix accessibility issues, or identify more relevant posts. I’d be extremely nervous of taking humans out of that loop – that way disaster lies – but perhaps the humans would not need to be developers as long as a developer had crafted the spec and the ochlotecture carefully enough in the first place. Community members themselves could suggest things, the LLM could present them to the group (perhaps creating a poll system for voting, or some other dispute-settling mechanism to do so), and it could use the ochlotectural and architectural models to help guide the actual development. It might even do a bit of proactive A/B testing, making an evolutionary (survival of the fittest) approach possible. Ultimately, it might even evolve how it evolves, developing its own strategies for engaging the community and responding to changing needs. 
It would be no more annoying that it constantly changes than it is for existing cloud services, with the added benefit that, if the community doesn’t like it, they can fix it.
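The discussion forum example might be sketched as follows. This is a toy under stated assumptions, not a design: the regular expression, the 30% threshold, and the `needs_precise_feedback` and `preferred_verb` keys are all hypothetical, and the function only ever returns a proposal for humans to vote on, never a change it applies itself.

```python
import re
from typing import Optional

# HYPOTHETICAL pattern for low-content approval replies; real detection would need far more care.
APPROVAL = re.compile(r"^\s*(this is great|great post|love this|\+1)[\s!.]*$", re.IGNORECASE)


def approval_share(replies: list[str]) -> float:
    """Fraction of replies that are nothing more than a brief expression of approval."""
    if not replies:
        return 0.0
    hits = sum(1 for reply in replies if APPROVAL.match(reply))
    return hits / len(replies)


def propose_change(replies: list[str], ochlotecture: dict, threshold: float = 0.3) -> Optional[dict]:
    """Return a proposal for the community to vote on, or None if no need is detected."""
    if approval_share(replies) < threshold:
        return None
    if ochlotecture.get("needs_precise_feedback"):
        mechanism = "multi-dimensional rating"
    else:
        mechanism = f"'{ochlotecture.get('preferred_verb', 'recommend')}' button"
    # Humans stay in the loop: the result is a suggestion, not a deployment.
    return {"proposal": mechanism, "status": "awaiting community vote"}


replies = ["This is great!", "+1", "I disagree because the sample is tiny.", "this is great"]
print(propose_change(replies, {"needs_precise_feedback": True}))
print(propose_change(replies, {"preferred_verb": "rate"}))
```

The ochlotectural model is what turns a generic observation ("people keep saying this is great") into a community-specific suggestion, which is the part a one-size-fits-all "like" button cannot do.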

In my perfect world all of this would rely on a local, open LLM but, though some are now extremely good for coding assistance, none currently have the large context windows and sophisticated tuning of the bigger commercial models. This will probably change. A hybrid approach might work in the interim, where the local model deals with everything apart from the coding itself, and the commercial model does the rest, but I’ve not thought through the economics of that.

Bricoleering: a new paradigm?

We are at the bottom of a learning curve with genAI right now. Most of us are simply replacing things we already do with LLMs, and that is highly problematic for reasons I and many others have written about extensively (see at least half my posts at https://jondron.ca/ai). In a world with machines that can creatively replicate almost any human cognitive skill, often at an expert level, there are high risks that our descendants are going to lose at least a portion of their own capacity to do so unaided. That’s not necessarily a bad thing, in itself. Few of us can still recite every word of a novel from memory, or create a bow and arrow, or perform complex mental arithmetic, because we don’t need to. Coarse grained cognition – thinking in bigger chunks, using the products of our own and other humans’ thought – is what has let us build pyramids, spaceships, welfare systems and virtually every invention ever, including this sentence. It’s our collective, extended cognition that makes it possible to constantly create more. That’s more of a problem when creativity itself is at stake, however, because we risk delegating too much of it to the machine, and allowing our own capabilities to atrophy. Already, I quite often tell the machine what I’m trying to do then ask it for a list of ideas and select one, rather than trying to think of one myself: that’s how the picture at the top of this post was conceived. At scale, this is not a great idea. If the world is going to be a better and not a worse place, we need to learn to be creative with the creative outputs of the cognitive Santa Claus machines, not simply to specify and use them. I think that the idea I suggest above is one of the ways this can happen. A plugin-based (or other component-oriented) approach enables us to do bricolage with the pieces, assembling, disassembling, and reassembling them in new and creative ways that neither we nor genAIs could do alone. 
It is not Lévi-Strauss’s bricolage of the “savage mind”, however, nor is it engineering. I think it is a new paradigm in which we do not simply assemble pieces we happen to have lying around but actively help to shape them so that they will fit. Our roles are closer to those of architects like Frank Gehry, who famously couldn’t use the machines that were essential to creating his iconic machine-made designs, instead relying on hand-drawn sketches to communicate his idea to those who could. I don’t know what to call this: “bricoleering” perhaps, or “adaptafacture”?

Edison’s Infinite Workshop: Innovation and education in the age of Cognitive Santa Claus Machines (slides from my keynote for IFERP’s EdInnovate 2026)

Posted April 26, 2026 by Jon Dron

Statues in Nek Chand’s Rock Garden, illustrating the power of bricolage as a creative process (photo by the author)

I’ve just finished giving a brief keynote for IFERP’s 3rd EdInnovate conference in Tokyo (sadly, because I love Tokyo in the Spring, I was online). Here are the slides. The conference was great: they put all of the keynotes and invited talks on a single day, with a very international and cross-disciplinary bunch of thought leaders (and me), and many of us were talking about very closely related themes, of rehumanizing and transforming education, from very different perspectives. Though most of it confirmed what I already know, I learned a lot.

The gist of my talk was that generative AI challenges us to transform both how we teach and what we teach. I have spoken quite a bit about the “how” in the past – essentially it is to double down on the tacit, the relational, and the social, to care about and to empower learners, to focus on what it means to be a human in whatever fields we are trying to teach. The stuff we should already have been doing.

The “what” is new. GenAIs are pretty good at creating stuff, and that’s a problem because it is very, very tempting to get them to think for us (hence cognitive Santa Claus machines: we delegate the thinking to them so that we don’t have to). We now have access to most human knowledge, at a (mostly) expert level, with little skill needed to elicit any of it. These things are like search engines that actually give us what we are searching for, in detail, and then do whatever it was that we were planning to do with the search results on our behalf. If our descendants are not to be less than us (and I really want more for my own grandchildren), we now have to figure out what to do with that. If the answer is to turn in an essay or perform an assignment that any AI could do at least as well, then the world will end with a whimper. Our jobs are to take that, problematize it, and use it to create more than any of us (human or machine) could have created alone. Luckily we already have a model for that: bricolage, or tinkering.

Bricolage has got a bad rap in the past, often compared unfavourably with engineering (notably by Lévi-Strauss, who defined it and saw it as primitive) but, as Papert and Turkle wrote many years ago, it is a very legitimate way of engaging with the concrete, a highly creative activity in its own right, and it can be a very powerful approach to design. The photo at the top of this post shows just a handful of the thousands of stunning artworks created by Nek Chand and his team, all of it built from the waste products of the industrial city of Chandigarh – pieces of wire, chunks of porcelain, sacks of concrete, and other found objects. I have visited twice and cried at the beauty of it both times.

I have written of bricolage before, e.g. here and here (nicely reported on and more clearly expressed by Stefanie Panke), as a means of researching things that don’t (yet) exist, and I intend to write more. It seems to me, though, that this is one of the key skills that we should be developing for ourselves and for our students, not just for research but as a process and product of learning. It is the natural evolution of the steady progress from high-resolution to low-resolution cognition that has driven human progress for millennia. In the past we built on and with what other humans had already done: it is and has always been what makes us smart that we can, through technologies (including language and art), share parts of our cognition: we think with our creations. The more we create, the more we can create. Now we have machines that are themselves bricoleurs par excellence, capable of producing any parts or pieces we can imagine, at vast scale, and quite a few we cannot. This is different. If we take advantage of it, we can continue the technology-fuelled exponential growth that is a hallmark of our species (and, to be perfectly clear, art, writing, poetry, architecture, music, and all the humanities are among the most significant of those technologies). If we don’t, we face not just the model collapse of genAIs but, ultimately, of our own cognition. This is not about replicating what we can already do. It’s about being able to do what we cannot yet imagine. This seems like a good mission for education to me.

More than a game: some thoughts on David Wiley’s “Random Audits as a Scalable Deterrent to Cheating”

Posted March 28, 2026, updated on March 28, 2026 by Jon Dron

Source: Random Audits as a Scalable Deterrent to Cheating: Using Game Theory to Design Fair and Effective Academic Integrity Systems for the AI Era by David Wiley

Though not widespread, the general principle of only assessing a sample of work with oral exams (viva voces) is well established, and is standard practice in a number of institutions (e.g. UC Berkeley or UC London). What’s smart and novel about David Wiley’s new variation on the theme is the rigour with which he approaches the problem. The headliner is his use of game theory to identify the optimum sample range (no point in auditing mediocre results or fails), sample rate (to make the risk of detection significant enough to deter wrongdoers), penalty for failure (neither so small that the risk is acceptable nor so large that people are deterred from applying it), and appropriate audit bonus (so honest students gain some but not too much benefit from being audited to make up for the discomfort, inconvenience, and pain). It’s a nicely balanced process, playing with the incentives so as to take some of the sting out of being selected to be assessed by offering opportunities to increase grades. There’s also a lot of careful thought given to the administrative and pedagogical details of how to make it all work, so that students are forced to think clearly about the pros and cons of cheating, and it is all done fairly and efficiently. It’s a very well considered set of techniques for reducing the faculty workload and reducing the chances of cheating.
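To make the interplay of sample rate, penalty, and bonus concrete, here is a toy expected-value sketch. It is emphatically not David Wiley’s actual model: it assumes a perfectly rational student, that any audited cheat is always caught, and that gain, penalty, and bonus can all be expressed in the same grade points. All of the function names and numbers are hypothetical.

```python
def expected_cheat(p: float, gain: float, penalty: float) -> float:
    """Expected payoff of cheating: keep the gain if not audited, pay the penalty if audited."""
    return (1 - p) * gain - p * penalty


def expected_honest(p: float, bonus: float) -> float:
    """Expected payoff of honesty relative to baseline: the audit bonus, if selected."""
    return p * bonus


def cheating_pays(p: float, gain: float, penalty: float, bonus: float = 0.0) -> bool:
    """True if a purely rational student would still prefer to cheat at audit rate p."""
    return expected_cheat(p, gain, penalty) > expected_honest(p, bonus)


def min_audit_rate(gain: float, penalty: float, bonus: float = 0.0) -> float:
    """Smallest audit rate at which cheating no longer pays in this toy model.

    Solves (1 - p) * gain - p * penalty <= p * bonus for p.
    """
    return gain / (gain + penalty + bonus)


# ASSUMED numbers: cheating gains 10 points, being caught costs 30.
print(min_audit_rate(gain=10, penalty=30))             # no audit bonus
print(min_audit_rate(gain=10, penalty=30, bonus=10))   # a bonus lowers the rate needed
print(cheating_pays(0.1, gain=10, penalty=30))
```

Even in this crude form the trade-offs are visible: a harsher penalty or a larger bonus lets the audit rate fall, which is what makes the approach scalable.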

For all that is good about it, I think it’s almost exactly the wrong idea, though I have an idea to save it.

Problems with oral exams

For the majority of students in search of credentials, oral exams are at the better end of the summative assessment spectrum, because they are:

efficient (on average, it takes no longer to ascertain someone knows what they are talking about than it does to properly mark an exam or assignment and, crucially, it demands less time from the student),
reliable (very hard, though not impossible to fake or cheat),
personal (you can explore personal strengths and misconceptions),
responsive (feedback can be immediate),
social (caring can be demonstrated),
often authentic (depends on context), and, above all,
useful learning experiences in their own right, for all concerned, including examiners.

In universities, oral exams predate written exams by many, many centuries. They were by far the most common way to assess students for credentials right up to at least the 19th Century, and they generally worked well, notwithstanding the problems dealing with geometry and other visual disciplines that led to the Cambridge Tripos (the first modern written exams) in the late C18th. They are still very popular in some regions, especially for higher degrees, though they have fallen out of favour across much of higher education because they are hard work and difficult to scale. While each one is quite efficient in itself, when you have to schedule a few hundred of them it really eats into your time and energy. There are some major issues for students who have speech impediments, hearing problems, or who are simply using a foreign language, so alternatives or workarounds must be available, and extraordinary care must be taken to avoid personal biases because it is prohibitively expensive and impractical to anonymize them. All in all, though, for most students it is one of the least bad of a bad bunch.

Unfortunately, oral exams have one very fatal flaw inasmuch as, far more than for written exams (which are unpleasant enough for most students), they can be incredibly intimidating. Few students actually like them but, for a significant number, they are beyond mortifying. I have known students to freeze, cry, walk out, and even fail an entire PhD (though that was later corrected) as a result of having to defend their work this way. The stress can be mitigated somewhat with counselling, therapy, practice, caring tuition, and sensitive questioning, but it is difficult if not impossible to completely eliminate this problem, and time spent developing counter-technologies to the technology of assessment is time better spent learning the subject in question.

I think that David’s rational game-theoretic approach fails to take this sufficiently into account. For students facing the prospect of extreme trauma, no matter how competent they might be in the subject, the most rational course of action in David’s system would often be to aim for a low mark that would not get audited rather than risk having to be examined. There are plenty of students who don’t need high GPAs, for whom a straight pass is a rational choice. However, in itself, this would be a risky strategy because it is really difficult to tread the fine line between a low pass and a fail or higher pass, either of which would be very bad news, all of which would add stress not just at exam time but throughout the course. Under such circumstances, a student who had taken the game theory to heart would probably realize that the most effective way to be likely to get a low pass would be to ask a generative AI to produce work at that level: in my own experiments I have found them to be remarkably good at targeting a particular grade, as long as you feed them half-decent rubrics.

It is also far from infallible, because few of us are rational game players. On the whole, cheating tends to occur when students are very stressed and they panic: it’s often barely a rational choice at all. Few actually want to cheat and all of them already know it is a risky option: it’s just the least bad of a limited number of very bad alternatives. Making the risks higher and quantifying them is not a solution to this. If anything, for at least a few of the most at-risk students, it will just make the problem worse because the pressure is greater. Also, for the truly disengaged students who are most likely to cheat, this might just be another thing they do not learn, so they would not even be playing the game, though they would certainly come to regret it if they were audited.

Sampling problems

Another problem with David’s approach is that it is a very much stronger signal of the authority and control that the teacher/institution has over the student than the conventional process, with no pretence that it serves any further purpose than to catch cheats. If it were to support learning then everyone should be doing it, and the fact that there is a reward for being audited just further emphasizes that it is an undesirable activity that students are being forced to do. At least as bad, it doesn’t just allow but it actively recommends an instrumental approach to learning: it literally teaches students how to game the system. For anyone wanting to use this approach, I would therefore strongly recommend combining it with ways to attempt to restore lost autonomy, for example by encouraging students to design some of their own outcomes, or to have input into the means of assessment, or to have plenty of flexibility in the timing of submissions, or at the very least to be able to choose different ways of demonstrating their competence from a range of options. Among the benefits of doing this, the chances of them cheating in the first place would be significantly reduced.

There is also a time commitment to learning how to play that game rather than learning the stuff the course is actually about. I don’t see an easy way of avoiding this altogether though, if it were applied across the board to a whole program, the proportion of time spent on it could be reduced for each course. It would be a brilliant idea to use it in a course on game theory, of course.

It bothers me that the method deliberately excludes students who don’t get great results. It seems to me that they are the ones who would most benefit from a chance to improve them, so it amplifies the divide between the haves and the have-nots. At the very least, it should be possible for such students to ask for an oral exam, under the same conditions as those who get selected for random testing. The selection process again sends a bad message: that high achievement makes you a suspect.

While the proposed sample rates make sense for a single course, if all courses worked this way then, by the end of the program, almost every student would have at some point been audited, most likely more than once. For someone with a strong phobia, this might actually be worse than having to do it for every course: knowing that, at any point, your worst nightmare is going to happen is probably not going to improve your chances of persisting to the end of a program. It’s a problem both in the stress-filled build-up and (if not selected) the massive surge of relief that follows. The pain/relief patterns are not dissimilar to those of, say, gambling or drug addiction.
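The arithmetic behind that claim is straightforward. The sketch below assumes, purely for illustration, a 10% audit rate per course, a 40-course program, independent selection in each course, and a student whose grades make them eligible for audit every time; none of those figures comes from David’s paper.

```python
def p_at_least_once(rate: float, courses: int) -> float:
    """Probability of being audited in at least one of `courses` independent draws."""
    return 1 - (1 - rate) ** courses


def p_more_than_once(rate: float, courses: int) -> float:
    """Probability of being audited two or more times (one minus P(zero) minus P(exactly one))."""
    p_zero = (1 - rate) ** courses
    p_one = courses * rate * (1 - rate) ** (courses - 1)
    return 1 - p_zero - p_one


RATE = 0.10      # ASSUMED per-course audit rate
COURSES = 40     # ASSUMED number of courses in the program

print(f"Audited at least once:  {p_at_least_once(RATE, COURSES):.1%}")
print(f"Audited more than once: {p_more_than_once(RATE, COURSES):.1%}")
print(f"Expected number of audits: {RATE * COURSES:.1f}")
```

A rate that looks modest in any one course compounds quickly across a program, so for an eligible student the question is not whether the oral exam will happen but when and how often.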

Motivation problems

David claims that it is not a technology problem but an incentive problem. I disagree. This very much is a technology problem, and David’s solution is totally a technological solution: it’s just not a digital technology problem. And, in the context of the technology in question – that of credentialing – it is not an incentive problem but a motivation problem. Treating it as an incentive problem limits it to the subset of motivation that is both extrinsic and externally regulated: the worst possible kind. Externally regulated extrinsic motivation reliably kills intrinsic motivation so this both takes away the love of simply doing the work and actively harms motivation to do so in future.

The trouble with David’s solution is that it doesn’t deal with or consider the reasons that students cheat in the first place: it’s just a response to the fact that some do. Vanishingly few students start out a course with the intention of cheating their way through it. Rather, the pressures they face (almost all extrinsic) make cheating a rational response and/or the result of panic. All that David’s solution does is to make it a bit less rational. Students will still do it for irrational, emotionally charged reasons, and it not only does nothing to eliminate the root causes but it actually amplifies them, piling on additional pressure.

Like all technologies, there are other ways to solve this problem and, like all technologies, it is a Faustian bargain that creates new problems of its own. David’s solution, with the aforementioned provisos, is a potentially effective and efficient solution to cheating but it is likely to have the opposite effect on learning, especially once the course is over. It’s just a counter-technology for dealing with flaws in the underlying credentialing approach, and it demands further counter-technologies of its own to deal with its big fatal flaw if it is going to work at all well. It’s not at all unusual in this.

A better solution?

I think this is fixable. I reckon David’s solution would work a lot better if, instead of auditing assignments or exams for a single course, it were applied to a basket of courses (say, 3-6 of them) and, in the oral exam, students were asked to synthesize, connect and utilize what they have learned in all of them. This is not unlike some fairly common approaches to PhDs or capstone projects, where students create something then talk about it in more or less formal ways (presentations, demos, crits, viva voces, etc). If done with commitment, it could largely decouple learning and assessment because instrumental revision would not be an option: the only way to revise effectively would be to engage in positive learning activities that involve exactly the kind of synthesis we would examine, which would make it personal, relevant, and interesting, especially if (to make it authentic) it were done with other people.

With a bit of ingenuity, it might be possible to remove all grades and credit for the courses themselves, so students could learn without the usual extrinsic pressures. Every student would automatically get a provisional generic pass on each of the basket of courses, no questions asked. If they were audited then they might improve that (or fail), as David suggests. For the sake of equity, every student would have the right to ask to be audited, so the high-flyers who cared about getting a high grade could have an opportunity to get one. The rest could learn with significantly reduced pressure.

An obvious objection is that it would increase the high stakes when that assessment did actually happen. One way to reduce that problem would be to allow repeated attempts, with no additional penalty, or to make it a “best of three” or something along those lines. Though that would somewhat reduce the efficiency of the solution, as long as it were structured to make it relatively rare, it would be worth the extra bother. It would also be good to provide coaching, counselling, and plentiful opportunities to practice. For some subjects there might be less pressured approaches than oral exams that would achieve similar results, such as observation studies of students working on a problem, or group discussions, or structured peer interviews. Perhaps it could be a series of conversations throughout the program, none of which carries a definitive grade in itself but that, together, add up to an overall assessment. There’s scope for further innovation here.

It would be more important than ever to provide plentiful formative assessment during the courses themselves, and to provide ways of practising those skills in synthesis. The latter could be done within those courses or, perhaps better, a “synthesis” course could be provided for this purpose, operating in much the same way as Brunel’s assessment modules in their Integrated Programme Assessment approach. Among the advantages of this, it would allow students to do some work that might be used as part of an alternative assessment for those suffering from extreme fear of or difficulties participating in the oral exam.

It is not perfect, and it would be no use for situations such as those at Athabasca University, where many students are taking only one or two courses, often as visitors from other programs. However, for program students, even more than David’s approach, this would massively reduce the marking burden while making a positive contribution to learning and motivation to learn.

Is higher education broken? Not exactly.

Posted March 3, 2026, updated on March 3, 2026 by Jon Dron

What does it mean for higher education to work?

The problem with claiming (as I sometimes do) that higher education is broken and needs to be transformed is that it begs the question of what it means for higher education to work, and that depends what you think it is for.

From the name you’d expect that higher education might be for …well… education, assuming that to be concerned with learning and teaching, but it outgrew that single purpose a very long time ago. Yes, learning & teaching still looms large, but credentialing is at least as significant (often more so) and, at least for some, so are research or various forms of service. But, depending on your perspective and context, a university or college might also or alternatively be thought of quite differently as, for example:

a driver of peace or prosperity in a society;
a creator of knowledge in the world;
a support for local economies;
training for industry;
a market for contract cheating;
a home for sports teams;
a sharer and preserver of cultural artifacts;
an incubator for the performing arts;
a means to get a better job;
a medical facility;
a production line for professors;
an enabler of social mobility;
a profit-/surplus-making business;
a political pawn;
a selection filter for smart people;
and so on, and on, and on.

You might reasonably object that you could take any one of these away apart from the teaching role and you would still be left with a recognizable educational institution and, indeed, some are possible only because of the teaching role. However, to some people, somewhere, some time, every one of those roles is the role that matters most, and might be a target for transformation. Like every instantiated technology, a university or college is an assembly. In fact it is a huge assembly. It is part of and contains countless other assemblies, and is thoroughly, deeply entangled with a host of other systems and subsystems on which it depends and that depend on it. Everyone within it or interacting with it perceives it from a different perspective, in different ways at different times, working together or independently as mutually affective coparticipants to do whatever it is that, from each of those different perspectives, it does. In many ways, as a whole, it thus resembles an ecosystem and, like an ecosystem, each individual part can be perceived as having a goal and a relationship with other parts, and with the whole, but the whole itself does not. I think this is probably a feature of institutions in general, and may be what distinguishes them most clearly from simple organizations and businesses.

So what?

As long as the distinct roles, from each individual’s perspective, do their jobs, this is not a problem. If you are interested, say, in getting an education then you can largely ignore everything else an educational institution does and judge it solely by whether it teaches, notwithstanding the huge complexities of knowing what that even means, let alone with what proxies to measure it.

Unfortunately, a fair number of these roles deeply and negatively impact others. For me, by far the biggest problem is that the credentialing role is fundamentally at odds with the teaching role, due to the profound negative impact of extrinsic motivation on intrinsic motivation (I’ve written a lot about this, e.g. in these slides and in How Education Works so I won’t repeat the arguments again here). Combined with the side effects of trying to teach everyone the same thing at the same time, this results in the vast majority of our most cherished teaching and assessment methods being nothing more than ways of restoring or replacing the intrinsic motivation sucked out of students by how we teach and assess. Other big conflicts matter too, though. For instance, when patents or copyrights are at stake, the business role battles with the underlying goal of increasing knowledge in the world, turning non-rival knowledge into a rivalrous commodity; ditto for the insanity that is journal publishing, where the public pays us to provide our editorial and reviewing services for papers on research that they also pay for, then the journals sell the papers back to us or charge us for sharing them, making obscene profits for an increasingly trivial service; similarly, the research role, that should in principle exist in a virtuous circle with teaching, is too often in competition with it and, in many institutions, teaching loses; the filtering role that rewards most universities (not mine) for excluding as many students as possible is in direct conflict with a mission to bring higher forms of learning to as many people as possible, and undermines the incentive to teach well because those carefully selected students will learn pretty well regardless of how well they are taught. There are countless other examples like this: public vs private good, excellence vs equity, local vs global responsibilities, supporting student diversity vs economic stability, and so on. 
Fixing one role invariably impacts others, usually negatively. These are structural issues that will persist as long as higher education continues to play those roles. The solutions to the problems in one role are the problems that other roles have to solve, and (to a large extent) they must be.

At a micro scale the problem is even more ubiquitous. Everyone is solving problems in their own local sphere, creating problems for others in their own local spheres, whose solutions cause problems for others, and so it goes around and comes around. Every time we create a solution to one problem we give rise to other problems elsewhere. To give a few trivial and commonplace examples of issues I am trying to deal with right now:

I recently learned of two courses that could not be launched because tutors for the single course that they replace would have to be rehired and lose benefits gained for long service. In terms of priorities and primary roles, this implies that offering stable employment to staff matters more than teaching. That’s not the intent of any particular individual involved in the process but it’s how the system works, thanks to union agreements that solved different problems a long time ago.
For nearly 50 years now, our undergraduate students have had 6 months to complete a course, unless they are grant-funded (an important minority), in which case they only get 4 months because funding bodies assume universities always teach in semesters of a standardized length and demand results within that timeframe. And so we are in the process of making all contracts 4 months, knowing full well that students will be more pressured, cheating will increase, and pass rates will go down, but at least it will be fairer.
When we commit structures to code they are supposed to model the system but, having done so, they normally dictate it. For instance, my need for all of our faculty to be able to see the teaching sites of all of our courses (a critical part of my strategy to improve our teaching) is under threat due to the cascading roles used to determine who can do what that are baked into the implementation of our LMS and that make it difficult and long-winded for our editors to edit our courses, because the roles have to be modified each time they use its impersonation function that is necessary for viewing courses as they will be experienced. The obvious solution is to fix those roles, not remove access for those who need it, but the editors lack such rights, and those who have them support other faculties with different and conflicting needs.
We have recently shifted to a centralized front-line support system, explicitly to deal with common difficulties students have in navigating and using our administrative systems and websites. The more obvious solution would be to make those systems work better in the first place. Instead, we employ vast numbers of people whose job it is to patch over gaps, errors, and poor design decisions made elsewhere. This reduces the pressure to fix the systems, so the need persists, except that now we have a whole load of people with jobs that would be in jeopardy if we fix them. We employ many people whose job is to fix problems caused by issues with how others do theirs: people dedicated to dealing with exam cheating, say, or accommodating disabilities, or the aforementioned editors. There’s a fine and indistinct line between dividing a workload so that people with the right expertise do the right things, and creating a workload because people with the wrong expertise have done the wrong things.

I could easily write pages of similar examples and, if you work for a university or college, I’m sure you could too: the specific problems may be peculiar to Athabasca University, but the underlying dynamics are ubiquitous in higher education and, for that matter, most large organizations. And I’m sure that you can think of ways to deal with any of them but that’s exactly the point: fixing them is what we all do, all the time, every day, on a grand scale, and educators have been doing so for nearly 1000 years so the number of fixes to fixes to fixes to fixes is vast. For almost any role or activity, no matter how small or how large, there is probably another role and set of activities on which it impinges, directly or otherwise.

The big problem is that, on the whole, we create counter-technologies to fix the worst of the problems and that’s a policy of despair, every counter-technology creating new problems for further counter-technologies to solve. In fact, a large part of the reason for all those many roles is precisely because counter-technologies were created to solve what probably seemed like pressing problems and, in an inevitable Faustian bargain, created the problems we now need to address. Every one of these counter-technologies increases the robustness of the whole, increasing the interdependencies, making the patterns more and more indelible so, even if we do occasionally come up with something truly different, the overall system holds together as a massive web of mutually interdependent pieces more strongly than ever.

The more things change…

For all the many structural problems, it would be a synecdochic fallacy of mistaking the part for the whole to describe higher education as broken. Sure, thanks to all those competing roles (especially credentialing) it is not particularly great at education (at least), so transformation is devoutly to be wished for but, by the most basic and essential criterion of all – survival – it is rampantly successful. In fact, it is exactly those competing and complementary roles that have sustained it because a diverse ecosystem is a resilient ecosystem. The webs of dependencies are mutually sustaining even, to a well-evolved point, when one is antagonistic to the other.

For nearly a millennium the university and its brethren have not only survived but have now spread to almost every populated region of the world, and they continue to expand. Within my lifetime, in my country of birth, enrolments in higher education have risen from around 5% of the population to around 50%. To achieve such success, it has had to evolve: the invention of written exams, say, in the 18th Century, Humboldtian models that justified and embedded research, the adoption of flexible curricula, or the admittance of women in the 19th Century, were all huge changes. It has lost the trivium and quadrivium along the way, and diversified enormously in the range of subjects taught. The technological systems are way more advanced and varied than they were. There are regional variations, and a few speciated niches (colleges, open universities, distance education, etc). Administratively, a lot has changed, from recruitment and enrolment to the roles of professional bodies, industry, and governments. It is constantly evolving, for sure.

But.

The main technological features that universities acquired in the first century of their existence are still fully present, in virtually unaltered form. Courses, classes, terms/semesters, professors, credentials, methods of teaching, organizational structures, methods of assessment, and plenty more are visibly the same species as their mediaeval forebears, and remain the central motifs of virtually all formal higher education. We may use a few more polyesters and zippers, and the gowns now come in women’s sizes but, at least once a year, many of us even dress the same, a behaviour shared with only a few other institutions like (in some countries) the legal profession or the church. On the subject of which, most universities continue to have roles like dean, chancellor, rector, provost, registrar, bursar and even the odd beadle (what even is that?) that not only reveal their ecclesiastic origins but also how little the basic entities in the system have since evolved.

If the purpose of higher education were simply to educate then we would expect it to work a lot better and to see a whole load more variation in how it is done, especially given the wide range of technologies that can now be used to overcome the problems caused by those features, but we don’t. It’s not just the purpose that survives: it’s the form. We can radically alter a great many processes but changing at least one or two of the central motifs themselves – which, to me, is what “transformation” must entail – is hardly ever even on the table.

Adaptation, not transformation

If the institution had a clear overriding goal then we could re-engineer it to work differently, but this is not an engineering problem: it’s an evolutionary problem. We build with what we have on what we have, a process of tinkering or bricolage that is anything but engineered. It is, though, not natural but technological evolution. In natural ecosystems massive disruption can occur when populations become isolated, or when the environment radically changes. Technological evolution emerges through recombination and assembly of parts, not genes, and the technologies of higher education have evolved to be globally connected and massively intertwingled with nearly every other part of nearly every society, making isolation virtually impossible. In nature, ecosystems can be disrupted by invasive species, parasites, etc, but our educational systems – technologies one and all – have evolved to be great at absorbing stuff rather than competing with it, so even that path is fraught. Even something as apparently disruptive as generative AI, which is impacting almost every aspect of the system and all the systems with which it interacts, is currently causing reinforcement of objectives-driven models of teaching, (at least in Western countries) cultural individualism, and highly traditionalist solutions to fears of cheating like written and oral exams at least as much as it is inspiring change.

For those of us who care about the education role, there are plenty of ways we could actually transform it if we had the power to make the necessary changes. Decoupling learning and assessment would be a good start. Not just separating teaching and tests: that would just result in teaching to the test, as we see now. The decoupling would have to be asymmetrical, so the assessed tasks would demand synthesis of many taught things. Or we could get rid of classes and courses: to a large extent, this is what (despite the name) many Connectivist MOOCs have attempted to do, and it is also the pattern behind things like the Khan Academy or Contact North’s AI Tutor Pro, not to mention traditional PhDs (at least in some countries), apprenticeship models of learning, most instructional videos on sites like YouTube, or Stack Exchange or Quora, and the bulk of student projects (like MOOCs, labelled as courses but lacking most if not all of their traditional trappings). Or we could keep courses but drop the schedules and time limits. If nothing else, imagining how things might work if we messed with those central motifs is a good way to stimulate creative use of what we have. If done at scale, such things could make a huge impact on our educational systems.

But they probably won’t.

The problem always comes back to the fact that, though (collectively) we could change the fitness landscape itself, making survival dependent on whatever we think matters most, we are unlikely to agree what does matter most. For some, better higher education would be measured in credentials, or explicit learning outcomes, or better fits with industry needs. Others would like it to advance their personal careers or status, or to do research without a profit motive. For me, improvements would be in far harder-to-measure aspects like building safer, kinder, smarter, more creative societies. Unfortunately (for me and others who feel that way), thanks to pace layering, the ones who could shape the fitness landscape the most are governments, and they are the least likely to do so. Governments tend to prefer things that are easier to measure, quicker to show results, that are most likely to keep voters voting for them and sponsors (especially from industry) sponsoring them. Increasingly, institutional mandates are measured by industry-impact, which does erode some traditional aspects of higher education but that reinforces the big ones, like the measurable, assessed, outcome-driven course, with its classes, its schedules, its semesters, its textbooks, its assessments, its teachers, and so on. It doesn’t have to, in principle but, in practice, those are not the things we adapt. If radical transformation ever does occur it will therefore most likely be the result of something so disruptive that the loss of higher education would be a minor concern: devastation caused by climate change, or nuclear war, or being hit by a large asteroid, for instance. And, to be honest, I’m not even sure that would be enough.

The limited chances of success should not discourage us from tinkering, all the time, whenever we can. Evolution must happen because the world that higher education inhabits evolves so, if this is the system we are stuck with, we should make it do what we want it to do as best we can. There are usually ways to reduce dependencies, techniques to decouple antagonistic roles, strategies of simplification, approaches to parcellating the landscape (skunkworks, etc), and values-based principles for prioritizing activities that can make it more likely that the changes will be successful and persistent. However, if we have learned anything from biological studies over the past many decades, it is that you shouldn’t mess with an ecosystem. Whatever we do will put it out of balance, and self-organizing dynamics will ensure that either the balance will be restored, or that it spirals out of control and breaks altogether. Either way, it will never be exactly what we planned and, on average, it will tend to eventually keep things much the same as they are while making most of it worse while it restabilizes itself.

Knowing that, though, can be useful. If every change will result in changes elsewhere, it is not enough to monitor the direct impact of an intervention: rather, we need to figure out ways of harvesting the outcomes across the system and/or, as best we are able, to model them in advance. No one has access to more than a fraction of the information needed, not least because a significant amount of it is tacit, embedded in the culture and practices of people and communities within the system. However, we can try to intentionally capture it, to tell stories, to share experiences and understandings across all those many niches. We can do what we can to make the invisible visible. We can talk. And we have technologies to help, inasmuch as we can train AIs to know our stories and ask them about the impacts of things we do, and to point out impacts that would be difficult if not impossible for any person to spot. And that, I think, is the only viable path we have. The problems we generally have to deal with are a direct result of local thinking: solutions in one space that cause problems in another. The less locally we think about such things, the greater the chances that we will avoid unwanted impacts elsewhere or, equally good, that we will cause wanted impacts. To achieve that demands openness and dialogue, channels through which we can share and communicate, and some way of compressing, parsing, and relaying all that so that sharing and communication is not the only thing we ever do. This is not an impossibly tall order but it certainly isn’t easy.