It’s all solid stuff that supports much of what I and many others have written about the value of belongingness and social interaction in learning but, like much research in fields such as psychology, education, sociology, and so on, it makes some seemingly innocuous but fundamentally wrong assertions of fact. For instance:
“Those who were instructed to strike up a conversation with someone new on public transport or with their cab driver reported a more positive commute experience than those instructed to sit in silence.”
What, all of them? That seems either unbelievably improbable, or the result of a flawed methodology, or a sign of way too small a sample size. The paper itself is inaccessibly paywalled so I don’t know for sure, but I suspect this is actually just a sloppy description of the findings. It is not the result of bad reporting in the Quartz article, though: it is precisely what the abstract of the paper itself actually claims. The researchers make several similar claims like “Those who were instructed to strike up a hypothetical conversation with a stranger said they expected a negative experience as opposed to just sitting alone.” Again – all of them? If that were true, no one would ever talk to strangers (which anyone that has ever stood in a line-up in Canada knows to be not just false but Trumpishly false), so this is either a very atypical group or a very misleading statement about group members’ behaviours. The findings are likely, on average, correct for the groups studied, but that’s not the way it is written.
The article is filled with similarly dubious quotes from distinguished researchers and, worse, pronouncements about what we should do as a result. Often the error is subtly couched in (accurate but misleadingly ambiguous) phrasing like “The group that engaged in friendly small talk performed better in the tests.” It is all too easy to carelessly read that as ‘all of the individuals in the group performed better than all of those in the other groups’, rather than as ‘on average, the collective group entity performed better than another collective group entity’, which is what was actually meant (and which is far less interesting). From there it is an easy – but dangerously wrong – step to claim that ‘if you engage in small talk then you will experience cognitive gains.’ It’s natural to want to extrapolate a general law from averaged behaviours, and in some domains (where experimental anomalies can be compellingly explained) it makes sense, but it’s wrong in most cases, especially when applied to complex systems like, say, anything involving the behaviour of people.
It’s a problem because, like most in my profession, I regularly use such findings to guide my own teaching. On average, results are likely (but far from certain) to be better than if I did not use them, but definitely not for everyone, and certainly not every time. Students do tend to benefit from engagement with other students, sure. It’s a fair heuristic, but there are exceptions, at least sometimes. And the exceptions aren’t just a statistical anomaly. These are real people we are talking about, not average people. When I do teaching well – nothing like enough of the time – I try to make it possible for those that aren’t average to do their own thing without penalty. I try to be aware of differences and cater for them. I try to enable those that wish it to personalize their own learning. I do this because I’ve never in my entire life knowingly met an average person.
Unfortunately, our educational systems really don’t help me in my mission because they are pretty much geared to cater for someone that probably doesn’t exist. That said, the good news is that there is a general trend towards personalized learning that figures largely in most institutional plans. The bad news is that (as Alfie Kohn brilliantly observes) what is normally meant by ‘personalized’ in such plans is not its traditional definition at all, but instead ‘learning that is customized (normally by machines) for students in order that they should more effectively meet our requirements.’ In case we might have forgotten, personalization is something done by people, not to people.
Further reading: Todd Rose’s ‘The End of Average’ is a great primer on how to avoid the average-to-the-particular trap and many other errors, including why learning styles, personality types, and a lot of other things many people believe to be true are utterly ungrounded, along with some really interesting discussion of how to improve our educational systems (amongst other things). I was gripped from start to finish and keep referring back to it a year or two on.
An interview with me by Graham Allcott, author of the bestselling How to be a Productivity Ninja and other books, for his podcast series Beyond Busy, and as part of the research for his next book. In it I ramble a lot about issues like social media, collective intelligence, motivation, technology, education, leadership, and learning, and Graham makes some incisive comments and asks some probing questions. The interview was conducted on the landing of the Grand Hotel, Brighton, last year.
A lot of progress has been made in medicine in recent years through the application of cocktails of drugs. Those used to combat AIDS are perhaps the best known, but there are many other applications of the technique to everything from lung cancer to Hodgkin’s lymphoma. The logic is simple. Different drugs attack different vulnerabilities in the pathogens (or cancers) they seek to kill. Though evolution means that some bacteria, viruses or cancers are likely to be adapted to escape one attack, the more different attacks you make, the less likely it is that any will survive.
Unfortunately, combinatorial complexity means this is not simply a question of throwing a bunch of the best drugs of each type together and gaining their benefits additively. I have recently been reading John H. Miller’s ‘A crude look at the whole: the science of complex systems in business, life and society’ which is, so far, excellent, and which addresses this and many other problems in complexity science. Miller uses the nice analogy of fashion to help explain the problem: if you simply choose the most fashionable belt, the trendiest shoes, the latest greatest shirt, the snappiest hat, etc, the chances of walking out with the most fashionable outfit by combining them are virtually zero. In fact, there’s a very strong chance that you will wind up looking pretty awful. The problem is not easily susceptible to reductive science because the variables all affect one another deeply. If your shirt doesn’t go with your shoes, it doesn’t matter how good either is separately. The same is true of drugs. You can’t simply pick those that are best on their own without understanding how they all work together. Not only may they fail to combine additively, they may often have highly negative effects, or may prevent one another being effective, or may behave differently in a different sequence, or in different relative concentrations. To make matters worse, side effects multiply as well as therapeutic benefits so, at the very least, you want to aim for the smallest number of compounds in the cocktail that you can get away with. Even were the effects of combining drugs positive, it would be premature to believe that the combination is the best possible unless you had actually tried them all. And therein lies the rub, because there are a great many ways to combine them.
Miller and colleagues have been using the ideas behind simulated annealing to create faster, better ways to discover working cocktails of drugs. They started with 19 drugs which, a small bit of math shows, could be combined in 2 to the power of 19 different ways – about half a million possible combinations (not counting sequencing or relative strength issues). As only 20 such combinations could be tested each week, the chances of finding an effective combination, let alone the best one, within any reasonable timeframe were slim. Simplifying a bit, rather than attempting to cover the entire range of possibilities, their approach finds a local optimum within one locale by picking a point and iterating variations from there until the best combination is found for that patch of the fitness landscape. It then checks another locale and repeats the process, continuing until a large enough portion of the fitness landscape has been covered to be confident of having found at least a good solution: there are then at least several peaks to compare. This also lets them follow up on hunches and use educated guesses to speed up the search. It seems pretty effective, at least when compared with alternatives that attempt a theory-driven intentional design (too many non-independent variables), and is certainly vastly superior to methodically trying every alternative, inasmuch as it is actually possible to do this within acceptable timescales.
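Simplifying further still, that restart-based local search can be sketched in a few lines of code. This is a toy reconstruction, not the researchers’ actual method: the 19 on/off drug choices match the study, but the scoring function, the hypothetical ‘synergy’ subset it rewards, and the number of restarts are all invented for illustration.

```python
import random

# Toy sketch (not Miller's actual code) of local search with random
# restarts over 19 on/off drug choices. The score function is invented:
# a big payoff for one hypothetical synergistic subset, a small payoff
# for a hypothetical pair, and a penalty for cocktail size.

N_DRUGS = 19              # 2**19 = 524,288 possible combinations
SYNERGY = {1, 4, 7, 12}   # hypothetical: pays off only when all four present
PAIR = {2, 3}             # hypothetical: small payoff when both present

def score(combo):
    on = {i for i, v in enumerate(combo) if v}
    s = -len(on)          # every drug added brings side effects
    if SYNERGY <= on:
        s += 50
    if PAIR <= on:
        s += 5
    return s

def hill_climb(combo):
    """Steepest ascent: repeatedly make the single best drug toggle."""
    while True:
        base = score(combo)
        deltas = []
        for i in range(N_DRUGS):
            cand = list(combo)
            cand[i] = 1 - cand[i]
            deltas.append((score(cand) - base, i))
        best_delta, best_i = max(deltas)
        if best_delta <= 0:
            return combo  # a local peak: no single change helps
        combo = list(combo)
        combo[best_i] = 1 - combo[best_i]

def search(restarts=50, seed=0):
    """Climb from many random starting points; keep the highest peak found."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        start = [rng.randint(0, 1) for _ in range(N_DRUGS)]
        peak = hill_climb(start)
        if best is None or score(peak) > score(best):
            best = peak
    return best

best = search()
print(sorted(i for i, v in enumerate(best) if v), score(best))
```

The point of the restarts is that a single climb usually gets trapped: starting from an empty cocktail, adding any one synergy drug looks like a pure cost, so the all-or-nothing payoff is invisible until a random start happens to land near it.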
The central trick is to deliberately go downhill on the fitness landscape, rather than following an uphill route of continuous improvement all the time, which may simply get you to the top of an anthill rather than the peak of Everest. Miller very effectively shows that this is the fundamental error committed by followers of the Six Sigma approach to management, an iterative method of process improvement originally invented to reduce errors in the manufacturing process: it may work well in a manufacturing context with a small number of variables to play with in a fixed and well-known landscape, but it is much worse than useless when applied in a creative industry like, say, education, because the chances that we are climbing a mountain and not an anthill are slim to negligible. In fact, the same is true even in manufacturing: if you are just making something inherently weak as good as it can be, it is still weak. There are lessons here for those that work hard to make our educational systems work better. For instance, attempts to make examination processes more reliable are doomed to fail because it’s exams that are the problem, not the processes used to run them. As I finish this while listening to a talk on learning analytics, I see dozens of such examples: most of the analytics tools described are designed to make the various parts of the educational machine work ‘better’, i.e. (for the most part) to help ensure that students’ behaviour complies with teachers’ intent. Of course, the only reason such compliance was ever needed was for efficient use of teaching resources, not because it is good for learning. Anthills.
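The anthill-versus-Everest contrast is easy to see in a toy example. Everything here is invented for the demo (the landscape, the positions, the temperature schedule): a pure uphill climber starting on the small peak stays there, while simulated annealing’s occasional downhill moves let it wander across the valley to the tall one.

```python
import math
import random

def height(x):
    """Invented fitness landscape: a small 'anthill' near x=20,
    a tall 'mountain' near x=80, on positions 0..99."""
    return (5 * math.exp(-((x - 20) ** 2) / 50)
            + 50 * math.exp(-((x - 80) ** 2) / 50))

def greedy_climb(x):
    """Move to a neighbouring point only if it is higher; stop at any peak."""
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n <= 99]
        best = max(neighbours, key=height)
        if height(best) <= height(x):
            return x
        x = best

def anneal(x, steps=5000, seed=0):
    """Sometimes accept downhill moves, with probability exp(-drop/T)."""
    rng = random.Random(seed)
    temperature = 5.0
    best = x
    for _ in range(steps):
        candidate = min(99, max(0, x + rng.randint(-10, 10)))
        drop = height(x) - height(candidate)
        if drop <= 0 or rng.random() < math.exp(-drop / temperature):
            x = candidate
        if height(x) > height(best):
            best = x
        temperature *= 0.999  # cool slowly: fewer downhill moves over time
    return best

print(height(greedy_climb(20)))  # stuck on the anthill
print(height(anneal(20)))        # usually finds the mountain
```

The cooling schedule is the annealing part: early on, almost any downhill step is accepted, so the search explores widely; as the temperature falls, it settles onto whatever high ground it has found.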
This way of thinking seems to me to have potentially interesting applications in educational research. We who work in the area are faced with an irreducibly large number of recombinable and mutually affecting variables that make any ethical attempt to do experimental research on effectiveness (however we choose to measure that – so many anthills here) impossible. It doesn’t stop a lot of people doing it, and telling us about p-values that prove their point in more or less scrupulous studies, but such studies are – not to put too fine a point on it – almost always completely pointless. At best, they might be telling us something useful about a single, non-replicable anthill, from which we might draw a lesson or two for our own context. But even a single omitted word in a lecture, a small change in inflection, let alone an impossibly vast range of design, contextual, historical and human factors, can have a substantial effect on learning outcomes and effectiveness for any given individual at any given time. We are always dealing with a lot more than 2 to the power of 19 possible mutually interacting combinations in real educational contexts. For even the simplest of research designs in a realistic educational context, the number of possible combinations of relevant variables is likely closer to 2 to the power of 100 (in base 10, that’s 1,267,650,600,228,229,401,496,703,205,376). To make matters worse, the effects we are looking for may sometimes not be apparent for decades (having recombined and interacted with countless others along the way) and, for anything beyond trivial reductive experiments that would tell us nothing really useful, could seldom be tested at a rate of more than a handful per semester, let alone 20 per week. This is a very good reason to do a lot more qualitative research, seeking meanings, connections, values and stories rather than trying to prove our approaches using experimental results.
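For a sense of scale, the arithmetic is easy to check: n independent binary choices give 2 to the power of n combinations, and even the ‘small’ drug-study space would take centuries to test exhaustively at the 20-per-week rate quoted above.

```python
# n independent binary choices give 2**n combinations.
print(2 ** 19)              # the drug-cocktail search space: 524,288
print(2 ** 19 / (20 * 52))  # years of exhaustive testing at 20 per week: ~504
print(2 ** 100)             # the (conservative) educational estimate
```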
Education is more comparable to psychology than medicine and suffers the same central problem, that the general does not transfer to the specific, as well as a whole bunch of related problems that Smedslund recently coherently summarized. The article is paywalled, but Smedslund’s abstract states his main points succinctly:
“The current empirical paradigm for psychological research is criticized because it ignores the irreversibility of psychological processes, the infinite number of influential factors, the pseudo-empirical nature of many hypotheses, and the methodological implications of social interactivity. An additional point is that the differences and correlations usually found are much too small to be useful in psychological practice and in daily life. Together, these criticisms imply that an objective, accumulative, empirical and theoretical science of psychology is an impossible project.”
You could simply substitute ‘education’ for ‘psychology’ in this, and it would read the same. But it gets worse, because education is as much about technology and design as it is about states of mind and behaviour, so it is orders of magnitude more complex than psychology. The potential for invention of new ways of teaching and new states of learning is essentially infinite. Reductive science thus has a very limited role in educational research, at least as it has hitherto been done.
But what if we took the lessons of simulated annealing to heart? I recently bookmarked an approach to more reliable research suggested by the Christensen Institute that might provide a relevant methodology. The idea behind this is (again, simplifying a bit) to do the experimental stuff, then to sweep the normal results to one side and concentrate on the outliers, performing iterations of conjectures and experiments on an ever more diverse and precise range of samples until a richer, fuller picture results. Although it would be painstaking and long-winded, it is a good idea. But one cycle of this is a bit like a single iteration of Miller’s simulated annealing approach, a means to reach the top of one peak in the fitness landscape, which may still be a low-lying peak. However, if, having done that, we jumbled up the variables again and repeated the process starting in a different place, we might stand a chance of climbing some higher anthills and, perhaps, over time we might even hit a mountain and begin to have something that looks like a true science of education, in which we might make some reasonable predictions that do not rely on vague generalizations. It would either take a terribly long time (which itself might preclude it because, by the time we had finished researching, the discipline would have moved somewhere else) or would hit some notable ethical boundaries (you can’t deliberately mis-teach someone), but it seems more plausible than most existing techniques, if a reductive science of education is what we seek.
To be frank, I am not convinced it is worth the trouble. It seems to me that education is far closer as a discipline to art and design than it is to psychology, let alone to physics. Sure, there is a lot of important and useful stuff to be learned about how we learn: no doubt about that at all, and a simulated annealing approach might speed up that kind of research. Painters need to know what paints do too. But from there to prescribing how we should therefore teach spans a big chasm that reductive science cannot, in principle or practice, cross. This doesn’t mean that we cannot know anything: it just means it’s a different kind of knowledge than reductive science can provide. We are dealing with emergent phenomena in complex systems that are ontologically and epistemologically different from the parts of which they consist. So, yes, knowledge of the parts is valuable, but we can no more predict how best to teach or learn from those parts than we can predict the shape and function of the heart from knowledge of cellular organelles in its constituent cells. But knowledge of the cocktails that result – that might be useful.
Interesting and thoughtful argument from Savage Minds mainly comparing the access models of two well-known anthropology journals, one of which has gone open and seems to be doing fine, the other of which is in dire straits and that almost certainly needs to open up, but for which it may be too late. I like two quotes in particular. The first is from the American Anthropologist’s editorial, explaining the difficulties they are in:
“If you think that making money by giving away content is a bad idea, you should see what happens when the AAA tries to make money selling it. To put it kindly, our reader-pays model has never worked very well. Getting over our misconceptions about open access requires getting over misconceptions of the success of our existing publishing program. The choice we are facing is not that of an unworkable ideal versus a working system. It is the choice between a future system which may work and an existing system which we know does not.”
I like that notion of blurring and believe that this is definitely the way to go. We are greatly in need of new models for the sharing, review, and discussion of academic works because the old ones make no sense any more. They are expensive, untimely, exclusionary and altogether over-populous. There have been many attempts to build dedicated platforms for that kind of thing over the years (one of my favourites being the early open peer-reviewing tools of JIME in the late 1990s, now a much more conventional journal, to its loss). But perhaps one of the most intriguing approaches of all comes not from academic presses but from the world of student newspapers. This article reports on a student newspaper shifting entirely into the (commercial but free) social media of Medium and Twitter, getting rid of the notion of a published newspaper altogether but still retaining some kind of coherent identity. I don’t love the notion of using these proprietary platforms one bit, though it makes a lot of sense for cash-strapped journalists trying to reach and interact with a broad readership, especially of students. Even so, there might be more manageable and more open, persistent ways (e.g. syndicating from a platform like WordPress or Known). But I do like the purity of this approach and the general idea is liberating.
It might be too radical an idea for academia to embrace at the moment but I see no reason at all that a reliable curatorial team, with some of the benefits of editorial control, posting exclusively to social media, might not entirely replace the formal journal, for both process and product. It already happens to an extent, including through blogs (I have cited many), though it would still be a brave academic that chose to cite only from social media sources, at least for most papers and research reports. But what if those sources had the credibility of a journal editorial team behind them and were recognized in similar ways, with the added benefit of the innate peer review social media enables? We could go further than that and use a web of trust to assert validity and authority of posts – again, that already occurs to some extent and there are venerable protocols and standards that could be re-used or further developed for that, from open badges to PGP, from trackbacks to WebMention. We are reaching the point where subtle distinctions between social media posts are fully realizable – they are not all one uniform stream of equally reliable content – where identity can be fairly reliably asserted, and where such an ‘unjournal’ could be entirely distributed, much like a Connectivist MOOC. Maybe more so: there is no reason there should even be a ‘base’ site to aggregate it all, as long as trust and identity were well established. It might even be unnecessary to have a name, though a hashtag would probably be worth using.
I wonder what the APA format for such a thing might be?