Administered by the OECD, PISA is basically a set of tests, adapted to each country, that attempt to measure educational performance across a range of skills in order to rank educational systems around the world. The rankings really matter to many countries, and help to determine educational policies across the planet, being especially impactful when countries don’t do well. Often, a low PISA score triggers educational reform (not always ending well), but sometimes countries just stop playing the game. India, for instance, dropped out a decade ago after coming second from the bottom, complaining of lack of adaptation to the Indian context (which is totally fair – India is incredibly diverse, so one measure absolutely does not fit all) though it will be back again next year. There are many reasons to dislike PISA, but the one I want to highlight here is Goodhart’s Law that, when a measure becomes a target, it ceases to be a good measure.
This article – a report on an interview with Andreas Schleicher, OECD Director of Education and Skills (a very smart fellow) – provides some useful food for thought. Though it focuses on China as a case in point, the interview is not so much about China’s ‘success’ as it is about PISA and its limitations in general. Among Schleicher’s more interesting insights is the fact that China’s test results came solely from its four most highly developed and economically successful provinces. These are very unrepresentative of the whole. In fact, China replaced Guangdong in its submission this time round because that province was blamed for poorer performance last time, suggesting that the Chinese government’s involvement with PISA is far more concerned with appearing effective on the international stage – with presenting a facade – than with actually improving learning. PISA is a test for countries, and some are quite happy to cheat on the test.
In fact, the biggest contributing factor to test results is, of course and as always, economic. Schleicher notes that, worldwide, the 10% most socioeconomically advantaged students have for at least 10 years consistently outperformed the 10% most disadvantaged students in reading by 141 score points, which equates to approximately three years’ worth of schooling. It is not news that by far the most productive way to improve the effectiveness of educational systems would be to diminish wealth inequalities. It is, though, worth noting that schools play a relatively small role in Chinese education, especially among more prosperous families, with vast amounts of (paid, private) tuition occurring outside schools. Similar extracurricular tuition patterns occur in several of the other highest ranking PISA countries, such as South Korea, Singapore, Japan, Hong Kong, and Taiwan. It is significant that, in these countries, test scores are extremely important in almost every way – economically, culturally, socially, and more – so there is a lot of teaching focused on test results at the expense of almost everything else.
It is also notable – and almost certainly a direct consequence of tests’ importance – that over 80% of Chinese students admit to cheating, which might be more than a minor contributor to the good results. In fairness, cheating rates for the US and Canada are also not too far short of that, correctly implying a serious endemic malaise with our educational systems worldwide (Goodhart’s Law, again), so this is just a relatively slight difference of degree, not of kind. Given the large amount of time spent learning outside school, the high levels of cheating, and the cherry-picking of top performing provinces, the implications are that, far from having a world-leading education system, teaching in China is actually really awful, on average. Among the things that can be gleaned from PISA results are that China performs very badly on productivity (points per hour of learning), and ranks 8th from bottom on life satisfaction for students. It is essentially a failure, by any reasonable measure. The PISA ranking is not quite a fiction but it is close. At least in the case of the high overall placing of China, it certainly fails to correctly measure the effectiveness of the educational system, if results are taken at face value.
There appear to be two distinct patterns among those countries that consistently achieve high PISA results, and they divide along broadly cultural lines. The first group includes the likes of China, South Korea, Japan, and Taiwan (all quite notable examples of what Hofstede describes as collectivist cultures), with high levels of out-of-school tuition, a strong educational emphasis on test scores, and great personal penalties for failure. These countries seem to achieve their high ranking by a very strong focus on passing the tests, with high penalties for failure and great significance for success. As a consequence, their educational systems cannot be seen as standalone causes but, rather, as creators of problems that have to be overcome by other means (most notably in the form of extra-curricular assistance that funds a booming personal tuition economy). Standard bearers for the other main pattern are Finland and Estonia, as well as Switzerland and Canada (though the latter two devolve educational responsibility to canton/province, so they are less consistently successful in the rankings). In Hofstede’s terms, these are more individualist societies. In this group, test scores (slightly) tend to be seen as a measure of only one of several consequences of teaching, rather than being the primary motivation for doing it. I am certainly culturally biased, but I cannot help but think this is a better way of going about the process: education is for society, much more than for the individual, and certainly not for economic gain, so it must be understood across many dimensions of value. Whether they agree with me or not, I am almost certain that most educators everywhere would like to think that education is about much more than achieving good test scores. It is only a matter of degree, though.
Education in all countries I am aware of relies on extrinsic motivation, and there are large pockets of excellence in the first group and large pockets of awfulness in the second. Averages are a stupid way to evaluate a whole country’s educational system, and they conceal great diversity. The boundaries are also blurred. Estonia, for instance, which is singled out in the article as a success story due to its rapid rise through the rankings, actually also makes extensive use of extra tuition in the form of ‘long day groups’ that take place in schools after curricular instruction. Estonia is no worse than most other countries in this regard, and in some ways superior because such long day groups take the place of at least some of the homework that is widely required in many countries, despite a singular lack of evidence that (on average) homework has more than a tiny effect on learning. At least Estonia’s approach involves a modicum of good education theory and evidence to support it.
Overall, I think the main thing that is revealed by the PISA process is that average test scores are, for the most part, an extremely poor means of comparing education systems. Given that it is useful for a government to know how their policies are working, there does need to be some way for them to observe how schools are doing, but it would seem more sensible to rely on trained inspectors reviewing schools, their teaching, the work of children, etc, than on test scores. At the very least they should be considering signs of happiness, motivation, community, and social achievement at least as much as academic achievement. However, Goodhart’s Law would cause its usual harm if such things became the dominant measures of success, and more than the lightest of inspections would normally cause more harm than good. I experienced something not too far removed from this (in the form of OFSTED inspections) in the UK as a parent and school governor back in the 1990s. The results were not pretty. For about a year leading up to them teachers’ workloads were massively strained by the need to report on everything, students suffered, resentments piled up, everyone suffered. Though OFSTED reports did sometimes lead to improvements in particularly bad schools, the effects on the vast majority of schools (and especially on teachers) were disastrous, often radically disrupting work, increasing stress levels beyond reasonable bounds, and leading to more than a few resignations and early retirements from the best, most dedicated teachers who could barely cope with the workloads at the best of times. They were forced to become bureaucrats, which is a role to which teachers tend to be very poorly suited. It was (and, I believe, may still be) beyond stupid, despite best intentions.
What is really needed is something more collegial, that is focused on improvement rather than judgment, that celebrates and builds on success rather than amplifying failure, where everyone involved in the process benefits and no one suffers. The whole point (as far as I understand it) is to improve what we do, not to blame those who fail. Appreciative Inquiry is a good start. Simple things like peer observation (with no penalties, no judgments, just formative commentary) can be more than adequate for the most part at a local level, and are beneficial to both observer and observed. Maybe – if someone thinks it necessary – inspectors (volunteers, perhaps, from the teaching profession) could look at samples of student work from further afield with a similarly positive, formative attitude. It might not provide numbers to compare but, if there were enough of a culture of sharing across the whole sector, and if inspectors came from across the geographical and cultural spectrum, it ought to be good enough to improve practice, and to spread good ideas around, so the intent would be achieved. Governments could receive reports on what actually matters – that things are getting better – rather than on what does not (that things are bad, according to some unreliable measurement that compares nothing of any real value to educators, students, or society). Teaching is a deeply soft technology that cannot be reductively simplified to a relationship of entailment. It can, though, as a lived, creative, social process, be improved. This should be the goal of all teachers, and of all those who can influence the process, including governments. PISA only achieves such results in a tiny minority of extreme cases. For the most part, it actively militates against them because it substitutes test scores for education in all its rich complexity. These are not even a passable proxy.
They are a gross distortion, an abomination that can trivially be turned to evil, self-serving purposes without in any way improving learning. Schleicher fully understands this. I wish that the people whom his organization serves did too.
Originally posted at: https://landing.athabascau.ca/bookmarks/view/5209267/is-china-really-the-educational-powerhouse-that-the-pisa-rankings-suggest-tldr-not-even-close