Skills lost due to COVID-19 school closures will hit economic output for generations (hmmm)

Snippet from OECD report on covid-19 and education This CBC report is one of many dozens of articles in the world’s press highlighting one rather small but startling assertion in a recent OECD report on the effects of Covid-19 on education – that the ‘lost’ third of a year of schooling in many countries will lead to an overall lasting drop in GDP of 1.5% across the world. Though it contains many more fascinating and useful insights that are far more significant and helpful, the report itself does make this assertion quite early on and repeats it for good measure, so it is not surprising that journalists have jumped on it. It is important to observe, though, that the reasoning behind it is based on a model developed by Hanushek and Woessman over several years, and an unpublished article by the authors that tries to explain variations in global productivity according to amount and  – far more importantly – the quality of education: that long-run productivity is a direct consequence of the cognitive skills (or knowledge capital) of a nation, that can be mapped directly to how well and how much the population is educated.

As an educator I find this model, at a glance, to be reassuring and confirmatory because it suggests that we do actually have a positive effect on our students. However, there may be a few grounds on which it might be challenged (disclaimer: this is speculation). The first and most obvious is that correlation does not equal causation. The fact that countries that do invest in improving education consistently see productivity gains to match in years to come is interesting, but it raises the question of what led to that investment in the first place and whether that might be the ultimate cause, not the education itself.  A country that has invested in increasing the quality of education would, normally, be doing so as a result of values and circumstances that may lead to other consequences and/or be enabled by other things (such as rising prosperity, competition from elsewhere, a shift to more liberal values, and so on).  The second objection might be that, sure, increased quality of education does lead to greater productivity, but that it is not the educational process that is causing it, as such. Perhaps, for instance, an increased focus on attainment raises aspirations. A further objection might be that the definition of ‘quality’ does not measure what they think it measures. A brief skim of the model used suggests that it makes extensive use of scores from the likes of TIMSS, PIRLS and PISA, standardized test approaches used to compare educational ‘effectiveness’ in different regions that embody quite a lot of biases, are often manipulated at a governmental level, and that, as I have mentioned once or twice before, are extremely dubious indicators of learning: in fact, even when they are not manipulated, they may indicate willingness to comply with the demands of the powerful more than learning (does that improve GDP? Probably).  Another objection might be that absence of time spent in school does not equate to absence of education. Indeed, Hanushek and Woessman’s central thesis is that it is not the amount but the quality of schooling that matters, so it seems bizarre that they might fall back on quantifying learning by time spent in school. We know for sure that, though students may not have been conforming to curricula at the rate desired by schools and colleges, they have not stopped learning. In fact, in many ways and in many places, there are grounds to believe that there have been positive learning benefits: better family learning, more autonomy, more thoughtful pedagogies, more intentional learning community forming, and so on.  Out of this may spring a renewed focus on how people learn and how best to support them, rather than maintaining a system that evolved in mediaeval times to support very different learning needs, and that is so solidly packed with counter technologies and so embedded in so many other systems that have nothing to do with learning that we have lost sight of the ones that actually matter. If education improves as a result, then (if it is true that better and more education improves the bottom line) we may even see gains in GDP. I expect that there are other reasons for doubt: I have only skimmed the surface of the possible concerns.

I may be wrong to be sceptical –  in fairness, I have not read the many papers and books produced by Hanushek and Woessman on the subject, I am not an economist, nor do I have sufficient expertise (or interest) to analyze the regression model that they use. Perhaps they have fully addressed such concerns in that unpublished paper and the simplistic cause-effect prediction distorts their claims. But, knowing a little about complex adaptive systems, my main objection is that this is an entirely new context to which models that have worked before may no longer apply and that, even if they do, there are countless other factors that will affect the outcome in both positive and negative ways, so this is not so much a prediction as an observation about one small part of a small part of a much bigger emergent change that is quite unpredictable. I am extremely cautious at the best of times whenever I see people attempting to find simple causal linear relationships of this nature, especially when they are so precisely quantified, especially when past indicators are applied to something wholly novel that we have never seen before with such widespread effects, especially given the complex relationships at every level, from individual to national.  I’m glad they are telling the story – it is an interesting one that no doubt contains grains of important truths – but it is just an informative story, not predictive science.  The OECD has a bit of track record on this kind of misinterpretation, especially in education. This is the same organization that (laughably, if it weren’t so influential) claimed that educational technology in the classroom is bad for learning. There’s not a problem with the data collection or analysis, as such. The problem is with the predictions and recommendations drawn from it.

Beyond methodological worries, though, and even if their predictions about GDP are correct (I am pretty sure they are not – there are too many other factors at play, including huge ones like the destruction of the environment that makes the odd 1.5% seem like a drop in the barrel) then it might be a good thing. It might be that we are moving – rather reluctantly – into a world in which GDP serves as an even less effective measure of success than it already is. There are already plentiful reasons to find it wanting, from its poor consideration of ecological consequences to its wilful blindness to (and causal effect upon) inequalities, to its simple inadequacy to capture the complexity and richness of human culture and wealth. I am a huge fan of the state of Bhutan’s rejection of the GDP, that it has replaced with the GNH happiness index. The GNH makes far more sense, and is what has led Bhutan to be one of the only countries in the world to be carbon positive, as well as being (arguably but provably) one of the happiest countries in the world. What would you rather have, money (at least for a few, probably not you), or happiness and a sustainable future? For Bhutan, education is not for economic prosperity: it is about improving happiness, which includes good governance, sustainability, and preservation of (but not ossification of) culture.

Many educators – and I am very definitely one of them – share Bhutan’s perspective on education. I think that my customer is not the student, or a government, or companies, but society as a whole, and that education makes (or should make) for happier, safer, more inventive, more tolerant, more stable, more adaptive societies, as well as many other good things. It supports dynamic meta-stability and thus the evolution of culture. It is very easy to lose sight of that goal when we have to account to companies, governments, other institutions, and to so many more deeply entangled sets of people with very different agendas and values, not to mention our inevitable focus on the hard methods and tools of whatever it is that we are teaching, as well as the norms and regulations of wherever we teach it. But we should not ever forget why we are here. It is to make the world a better place, not just for our students but for everyone. Why else would we bother?

Originally posted at: https://landing.athabascau.ca/bookmarks/view/6578662/skills-lost-due-to-covid-19-school-closures-will-hit-economic-output-for-generations-hmmm

Black holes are simpler than forests and science has its limits

Mandelbrot set (Wikipedia, https://en.wikipedia.org/wiki/Mandelbrot_set)Martin Rees (UK Astronomer Royal) takes on complexity and emergence. This is essentially a primer on why complex systems – as he says, accounting for 99% of what’s interesting about the world – are not susceptible to reductionist science despite being, at some level, reducible to physics. As he rightly puts it, “reductionism is true in a sense. But it’s seldom true in a useful sense.” Rees’s explanations are a bit clumsy in places – for instance, he confuses ‘complicated’ with ‘complex’ once or twice, which is a rooky mistake, and his example of the Mandelbrot Set as ‘incomprehensible’ is not convincing and rather misses the point about why emergent systems cannot be usefully explained by reductionism (it’s about different kinds of causality, not about complicated patterns) – but he generally provides a good introduction to the issues.

These are well-trodden themes that most complexity theorists have addressed in far more depth and detail, and that usually appear in the first chapter of any introductory book in the field, but it is good to see someone who, from his job title, might seem to be an archetypal reductive scientist (he’s an astrophysicist) challenging some of the basic tenets of his discipline.

Perhaps my favourite works on the subject are John Holland’s Signals and Boundaries, which is a brilliant, if incomplete, attempt to develop a rigorous theory to explain and describe complex adaptive systems, and Stuart Kauffman’s flawed but stunning Reinventing the Sacred, which (with very patchy success) attempts to bridge science and religious belief but that, in the process, brilliantly and repeatedly proves, from many different angles, the impossibility of reductive science explaining or predicting more than an infinitesimal fraction of what actually matters in the universe. Both books are very heavy reading, but very rewarding.

Address of the bookmark: https://aeon.co/ideas/black-holes-are-simpler-than-forests-and-science-has-its-limits

Originally posted at: https://landing.athabascau.ca/bookmarks/view/2874665/black-holes-are-simpler-than-forests-and-science-has-its-limits

Cocktails and educational research

A lot of progress has been made in medicine in recent years through the application of cocktails of drugs. Those used to combat AIDS are perhaps the most well-known, but there are many other applications of the technique to everything from lung cancer to Hodgkin’s lymphoma. The logic is simple. Different drugs attack different vulnerabilities in the pathogens etc they seek to kill. Though evolution means that some bacteria, viruses or cancers are likely to be adapted to escape one attack, the more different attacks you make, the less likely it will be that any will survive.

Simulated learningUnfortunately, combinatorial complexity means this is not a simply a question of throwing a bunch of the best drugs of each type together and gaining their benefits additively. I have recently been reading John H. Miller’s ‘A crude look at the whole: the science of complex systems in business, life and society‘ which is, so far, excellent, and that addresses this and many other problems in complexity science. Miller uses the nice analogy of fashion to help explain the problem: if you simply choose the most fashionable belt, the trendiest shoes, the latest greatest shirt, the snappiest hat, etc, the chances of walking out with the most fashionable outfit by combining them together are virtually zero. In fact, there’s a very strong chance that you will wind up looking pretty awful. It is not easily susceptible to reductive science because the variables all affect one another deeply. If your shirt doesn’t go with your shoes, it doesn’t matter how good either are separately. The same is true of drugs. You can’t simply pick those that are best on their own without understanding how they all work together. Not only may they not additively combine, they may often have highly negative effects, or may prevent one another being effective, or may behave differently in a different sequence, or in different relative concentrations. To make matters worse, side effects multiply as well as therapeutic benefits so, at the very least, you want to aim for the smallest number of compounds in the cocktail that you can get away with. Even were the effects of combining drugs positive, it would be premature to believe that it is the best possible solution unless you have actually tried them all. And therein lies the rub, because there are really a great many ways to combine them.

Miller and colleagues have been using the ideas behind simulated annealing to create faster, better ways to discover working cocktails of drugs. They started with 19 drugs which, a small bit of math shows, could be combined in 2 to the power of 19 different ways – about half a million possible combinations (not counting sequencing or relative strength issues). As only 20 such combinations could be tested each week, the chances of finding an effective, let alone the best combination, were slim within any reasonable timeframe. Simplifying a bit, rather than attempting to cover the entire range of possibilities, their approach finds a local optimum within one locale by picking a point and iterating variations from there until the best combination is found for that patch of the fitness landscape. It then checks another locale and repeats the process, and iterates until they have covered a large enough portion of the fitness landscape to be confident of having found at least a good solution: they have at least several peaks to compare. This also lets them follow up on hunches and to use educated guesses to speed up the search. It seems pretty effective, at least when compared with alternatives that attempt a theory-driven intentional design (too many non-independent variables), and is certainly vastly superior to methodically trying every alternative, inasmuch as it is actually possible to do this within acceptable timescales.

The central trick is to deliberately go downhill on the fitness landscape, rather than following an uphill route of continuous improvement all the time, which may simply get you to the top of an anthill rather than the peak of Everest in the fitness landscape. Miller very effectively shows that this is the fundamental error committed by followers of the Six-Sigma approach to management, an iterative method of process improvement originally invented to reduce errors in the manufacturing process: it may work well in a manufacturing context with a small number of variables to play with in a fixed and well-known landscape, but it is much worse than useless when applied in a creative industry like, say, education, because the chances that we are climbing a mountain and not an anthill are slim to negligible. In fact, the same is true even in manufacturing: if you are just making something inherently weak as good as it can be, it is still weak. There are lessons here for those that work hard to make our educational systems work better. For instance, attempts to make examination processes more reliable are doomed to fail because it’s exams that are the problem, not the processes used to run them. As I finish this while listening to a talk on learning analytics, I see dozens of such examples: most of the analytics tools described are designed to make the various parts of the educational machine work ‘ better’, ie. (for the most part) to help ensure that students’ behaviour complies with teachers’ intent. Of course, the only reason such compliance was ever needed was for efficient use of teaching resources, not because it is good for learning. Anthills.

This way of thinking seems to me to have potentially interesting applications in educational research. We who work in the area are faced with an irreducibly large number of recombinable and mutually affective variables that make any ethical attempt to do experimental research on effectiveness (however we choose to measure that – so many anthills here) impossible. It doesn’t stop a lot of people doing it, and telling us about p-values that prove their point in more or less scupulous studies, but they are – not to put too fine a point on it – almost always completely pointless.  At best, they might be telling us something useful about a single, non-replicable anthill, from which we might draw a lesson or two for our own context. But even a single omitted word in a lecture, a small change in inflection, let alone an impossibly vast range of design, contextual, historical and human factors, can have a substantial effect on learning outcomes and effectiveness for any given individual at any given time. We are always dealing with a lot more than 2 to the power of 19 possible mutually interacting combinations in real educational contexts. For even the simplest of research designs in a realistic educational context, the number of possible combinations of relevant variables is more likely closer to 2 to the power of 100 (in base 10 that’s  1,267,650,600,228,229,401,496,703,205,376). To make matters worse, the effects we are looking for may sometimes not be apparent for decades (having recombined and interacted with countless others along the way) and, for anything beyond trivial reductive experiments that would tell us nothing really useful, could seldom be done at a rate of more than a handful per semester, let alone 20 per week. This is a very good reason to do a lot more qualitative research, seeking meanings, connections, values and stories rather than trying to prove our approaches using experimental results. Education is more comparable to psychology than medicine and suffers the same central problem, that the general does not transfer to the specific, as well as a whole bunch of related problems that Smedslund recently coherently summarized. The article is paywalled, but Smedlund’s abstract states his main points succinctly:

“The current empirical paradigm for psychological research is criticized because it ignores the irreversibility of psychological processes, the infinite number of influential factors, the pseudo-empirical nature of many hypotheses, and the methodological implications of social interactivity. An additional point is that the differences and correlations usually found are much too small to be useful in psychological practice and in daily life. Together, these criticisms imply that an objective, accumulative, empirical and theoretical science of psychology is an impossible project.”

You could simply substitute ‘education’ for ‘psychology’ in this, and it would read the same. But it gets worse, because education is as much about technology and design as it is about states of mind and behaviour, so it is orders of magnitude more complex than psychology. The potential for invention of new ways of teaching and new states of learning is essentially infinite. Reductive science thus has a very limited role in educational research, at least as it has hitherto been done.

But what if we took the lessons of simulated annealing to heart? I recently bookmarked an approach to more reliable research suggested by the Christensen Institute that might provide a relevant methodology. The idea behind this is (again, simplifying a bit) to do the experimental stuff, then to sweep the normal results to one side and concentrate on the outliers, performing iterations of conjectures and experiments on an ever more diverse and precise range of samples until a richer, fuller picture results. Although it would be painstaking and longwinded, it is a good idea. But one cycle of this is a bit like a single iteration of Miller’s simulated annealing approach, a means to reach the top of one peak in the fitness landscape, that may still be a low-lying peak. However if, having done that, we jumbled up the variables again and repeated it starting in a different place, we might stand a chance of climbing some higher anthills and, perhaps, over time we might even hit a mountain and begin to have something that looks like a true science of education, in which we might make some reasonable predictions that do not rely on vague generalizations. It would either take a terribly long time (which itself might preclude it because, by the time we had finished researching, the discipline will have moved somewhere else) or would hit some notable ethical boundaries (you can’t deliberately mis-teach someone), but it seems more plausible than most existing techniques, if a reductive science of education is what we seek.

To be frank, I am not convinced it is worth the trouble. It seems to me that education is far closer as a discipline to art and design than it is to psychology, let alone to physics. Sure, there is a lot of important and useful stuff to be learned about how we learn: no doubt about that at all, and a simulated annealing approach might speed up that kind of research. Painters need to know what paints do too. But from there to prescribing how we should therefore teach spans a big chasm that reductive science cannot, in principle or practice, cross. This doesn’t mean that we cannot know anything: it just means it’s a different kind of knowledge than reductive science can provide. We are dealing with emergent phenomena in complex systems that are ontologically and epistemologically different from the parts of which they consist. So, yes, knowledge of the parts is valuable, but we can no more predict how best to teach or learn from those parts than we can predict the shape and function of the heart from knowledge of cellular organelles in its constituent cells. But knowledge of the cocktails that result – that might be useful.