Can GPT-3 write an academic paper on itself, with minimal human input?

Brilliant. The short answer is, of course, yes, and it doesn’t do a bad job of it. This is conceptual art of the highest order.

This is the preprint of a paper written by GPT-3 (as first author) about itself, submitted to “a well-known peer-reviewed journal in machine intelligence”. The second and third authors provided guidance about themes, datasets, weightings, etc., but that’s as far as it goes. They do provide commentary as the paper progresses, but they tried to keep that as minimal as needed, so that the paper could stand or fall on its own merits. The paper is not too bad. A bit repetitive, a bit shallow, but it’s just a 500-word paper – hardly even an extended abstract – so that’s about par for the course. The arguments and supporting references are no worse than many I have reviewed, and considerably better than some. The use of English is much better than that of the majority of papers I review.

In an article about it in Scientific American the co-authors describe some of the complexities in the submission process. They actually asked GPT-3 for its consent to publication (it said yes), but this just touches the surface of some of the huge ethical, legal, and social issues that emerge. Boy there are a lot of those! The second and third authors deserve a prize for this. But what about the first author? Well, clearly it does not deserve one, because its orchestration of phenomena is not for its own use, and it is not even aware that it is doing the orchestration. It has no purpose other than that of the people training it. In fact, despite having written a paper about itself, it doesn’t even know what ‘itself’ is in any meaningful way. But it raises a lot of really interesting questions.

It would be quite interesting to train GPT-3 with (good) student assignments to see what happens. I think it would potentially do rather well. If I were an ethically imperfect, extrinsically-driven student with access to this, I might even get it to write my assignments for me. The assignments might need a bit of tidying here and there, but the quality of prose and the general quality of the work would probably result in a good B and most likely an A, with very little extra tweaking. With a bit more training it could almost certainly mimic a particular student’s style, including all the quirks that would make it seem more human. Plagiarism detectors wouldn’t stand a chance, and I doubt that many (if any) humans would be able to say with any assurance that it was not the student’s own work.

If it’s not already happening, this is coming soon, so I’m wondering what to do about it. I think my own courses are slightly immune thanks to the personal and creative nature of the work and the big emphasis on reflection in all of them (though those with essays would be vulnerable), but it would not take too much ingenuity to get GPT-3 to deal with that problem, too: at least, it could greatly reduce the effort needed. I guess we could train our own AIs to recognize the work of other AIs, but that’s an arms race we’d never be able to definitively win. I can see the exam-loving crowd loving this, but they are in another arms race that they stopped winning long ago – there’s a whole industry devoted to making cheating in exams pay, and it’s leaps ahead of the examiners, including those with both online and in-person proctors. Oral exams, perhaps? That would make it significantly more difficult (though far from impossible) to cheat. I rather like the notion that the only summative assessment model that stands a fair chance of working is the one with which academia began.

It seems to me that the only way educators can sensibly deal with the problem is to completely divorce credentialling from learning and teaching, so there is no incentive to cheat during the learning process. This would have the useful side-effect that our teaching would have to be pretty good and pretty relevant, because students would only come to learn, not to get credentials, so we would have to focus solely on supporting them, rather than controlling them with threats and rewards. That would not be such a bad thing, I reckon, and it is long overdue. Perhaps this will be the catalyst that makes it happen.

As for credentials, that’s someone else’s problem. I don’t say that because I want to wash my hands of it (though I do) but because credentialling has never had anything whatsoever to do with education apart from in its appalling inhibition of effective learning. It only happens at the moment because of historical happenstance, not because it ever made any pedagogical sense. I don’t see why educators should have anything to do with it. Assessment (by which I solely mean feedback from self or others that helps learners to learn – not grades!) is an essential part of the learning and teaching process, but credentials are positively antagonistic to it.

Originally posted at: https://landing.athabascau.ca/bookmarks/view/14216255/can-gpt-3-write-an-academic-paper-on-itself-with-minimal-human-input

Joyful assessment: beyond high-stakes testing

Here are my slides from my presentation at the Innovate Learning Summit yesterday. It’s not world-shattering stuff – just a brutal attack on proctored, unseen written exams (PUWEs, pronounced ‘pooies’), followed by a description of the rationale, process, benefits, and unwanted consequences behind the particular portfolio-based approach to assessment employed in most of my teaching. It includes a set of constraints that I think are important to consider in any assessment process, grouped into pedagogical, motivational, and housekeeping (mainly relating to credentials) clusters. I list 13 benefits of my approach relating to each of those clusters, which I think make a pretty resounding case for using it instead of traditional assignments and tests. However, I also discuss outstanding issues. Most of these relate to the external context and expectations of students or the institution, but a couple are fairly fundamental flaws that I am still struggling with – notably that prompt, caring, helpful instructor/tutor engagement is essential to making it all work, which can be highly problematic when it doesn’t happen.

Over two dozen people with ties to India’s $1-billion exam scam have died mysteriously in recent months

“… the scale of the scam in the central state of Madhya Pradesh is mind-boggling. Police say that since 2007, tens of thousands of students and job aspirants have paid hefty bribes to middlemen, bureaucrats and politicians to rig test results for medical schools and government jobs.

So far, 1,930 people have been arrested and more than 500 are on the run. Hundreds of medical students are in prison — along with several bureaucrats and the state’s education minister. Even the governor has been implicated.

A billion-dollar fraud scheme, perhaps dozens murdered, nearly 2,000 in jail and hundreds more on the run. How can we defend a system that does this to people? Though opportunities for corruption may be higher in India, it is not peculiar to the culture. It is worth remembering that more than two-thirds of Canadian high school students cheat (I have seen some estimates that are notably higher – this was just the first in the search results and illustrates the point well enough):

According to a survey of Canadian university & college students:

  • Cheated on written work in high school 73%
  • Cheated on tests in high school 58%
  • Cheated on a test as undergrads 18%
  • Helped someone else cheat on a test 8%

According to a survey of 43,000 U.S. high school students:

  • Used the internet to plagiarize 33%
  • Cheated on a test last year 59%
  • Did it more than twice 34%
  • Think you need to cheat to get ahead 39%

Source: http://www.cbc.ca/manitoba/features/universities/

When it is a majority phenomenon, this is the moral norm, not an aberration. The problem is a system that makes this a plausible and, for many, a preferable solution, despite knowing it is wrong. This means the system is flawed, far more than the people in it. The problems emerge primarily because, in the cause of teaching, we make people do things they do not want to do, and threaten or reward them to enforce compliance. It’s not a problem with human nature; it’s a rational reaction to extrinsic motivation, especially when the threat is as great as we make it. Even my dog cheats under those conditions if she can get away with it.

When the point of learning is the reward, then there is no point to learning apart from the reward and, when it’s to avoid punishment, it’s even worse. The quality of learning is always orders of magnitude lower than when we learn something because we want to learn it, or as a side-effect of doing something that interests us, but the direct consequence of extrinsic motivation is to sap away intrinsic motivation, so even those with an interest mostly have at least some of it kicked or cajoled out of them. That’s a failure on a majestic scale. If tests given in schools and universities had some discriminatory value, it might still be justifiable, but perhaps the dumbest thing of all about the whole crazy mess is that a GPA has no predictive value at all when it comes to assessing competence.

Address of the bookmark: http://www.theprovince.com/health/Over+dozen+people+with+ties+India+billion+exam+scam+have+died/11191722/story.html

Exam focus damaging pupils' mental health, says NUT – BBC News

A report on a survey of 8,000 teachers and a review of the research.

The report sponsors observe…

“Many of the young people Young Minds works with say that they feel completely defined by their grades and that this is very detrimental to their wellbeing and self-esteem.”

It seems that at least some of their teachers do indeed (reluctantly) define them that way…

One junior school teacher said: “I am in danger of seeing them more in terms of what colour they are in my pupils’ list eg are they red (below expectation), green (above expectation) or purples (Pupil Premium) – rather than as individuals.”

Indeed, it appears to be endemic…

Kevin Courtney, deputy general-secretary of the NUT, said: “Teachers at the sharp end are saying this loud and clear, ‘If it isn’t relevant to a test then it is not seen as a priority.’

“The whole culture of a school has become geared towards meeting government targets and Ofsted expectations. As this report shows, schools are on the verge of becoming ‘exam factories’.”

He argued the accountability agenda was “damaging children’s experience of education”, which should be joyful and leave them with “a thirst for knowledge for the rest of their lives”.

This is terrible and tragic. So surely the British government is trying to do something about it? Not so much…

A Department for Education spokesperson said: “Part of our commitment to social justice is the determination to ensure every child is given an education that allows them realise their potential.

“That’s why we are raising standards with a rigorous new curriculum, world class exams and new accountability system that rewards those schools which help every child to achieve their best.”

Helping people to realise their potential is a noble aim. A “rigorous new curriculum, world class exams and new accountability system” is a guaranteed way to prevent that from happening. Duh. Didn’t those that run the UK government learn anything in their expensive private schools? Oh…

Address of the bookmark: http://www.bbc.co.uk/news/education-33380155

The death of the exam: Canada is at the leading edge of killing the dreaded annual ‘final’ for good | National Post

Good news!

There’s not much to disagree with in this article, which reports on some successful efforts to erode the monstrously ugly blight of exams in Canada and beyond, and some of the more obvious reasoning behind the initiatives to kill them. They don’t work, they’re unfair, they’re antagonistic to learning, they cause pain, etc. All true.

Address of the bookmark: http://news.nationalpost.com/news/canada/the-death-of-the-exam-canada-is-at-the-leading-edge-of-killing-the-final-for-good

What exams have taught me

http://community.brighton.ac.uk/jd29/weblog/45251.html

I have argued at some length on numerous occasions that exams, especially in their traditional unseen, time-limited, paper-based form, without access to books or Internet or friends, are the work of the devil and fundamentally wrong in almost every way that I can think of. They are unfair, resource-intensive, inauthentic, counter-productive, anti-educational, dispiriting, soulless products of a mechanistic age that represent an ethos that we should condemn as evil.

And yet they persist.

I have been wondering why something so manifestly wrong should maintain such a hold on our educational system even though it is demonstrably anti-educational. Surely it must be more than a mean-spirited small-minded attempt to ensure that people are who they say they are?

I think I have the answer.

Exams are so much a part of our educational system, pervading almost every subject area, that they teach a deeper, more profound set of lessons than any of the subjects that they relate to. Clearly, from their ubiquity, they must teach more important and basic things than, say, maths, languages, or history. Subjects may come and subjects may go, but the forms of assessment remain startlingly constant. So, I have been thinking about what exams taught me:

  • that slow, steady, careful work is not worth the hassle – a bit of cramming (typically one to three days seemed to work for me) in a mad rush just before the event works much more effectively and saves a lot of time
  • the corollary – adrenalin is necessary to achieve anything worth achieving
  • that the most important things in life generally take around three hours to complete
  • that extrinsic motivation, the threat of punishment and the lure of reward, is more important than making what we do fun, enjoyable and intrinsically rewarding
  • that we are judged not on what we achieve or how we grow but on how well we can display our skills in an intense, improbably weird and disconcerting setting

I learnt to do exams early in life better than I learnt most of the subjects I was examined on, and have typically done far better than I deserve in such circumstances. I have learnt my lessons well in real life. I (mostly) hit deadlines with minutes to spare and seldom think about them more than a day or two in advance. I perform fairly well in adrenalin-producing circumstances. I summarise and display knowledge that I don’t really have to any great extent. I extemporise. I do things because I fear punishment or crave reward. I play to the rules even when the rules are insane. A bit of high blood pressure comes with the territory. Sometimes this is really useful, but I am trying hard to get out of the habit of always working this way and to adopt some other approaches sometimes.

There are many other lessons that our educational systems teach us beyond the subject matter – I won’t even begin to explore what we learn from sitting in rows, staying quiet and listening to an authority figure tell us things but, suffice it to say, I haven’t retained much knowledge of grammar, calculus, geography or technical drawing, but I am still unlearning attitudes and beliefs that such bizarre practices instilled in me.

Assessment is good. Assessment tells us how we are doing, where we need to try new things, different approaches, as well as what we are doing right. Assessment is a vital part of the learning process, whether we do it ourselves or get feedback from others (both is best). But assessment should not be the goal. Assessment is part of the process.

Accreditation is good too. Accreditation tells the world that we can do what we claim we can do. It is important that there are ways to verify to others that we are capable (most obviously in the case of people on whom others depend greatly, such as surgeons, bus drivers and university professors). Except in cases where the need to work under enormous pressure in unnatural conditions is a prerequisite (there are some occasions), I would just prefer that we relied on authentic evidence rather than this frighteningly artificial process that tells us very little about how people actually perform in the task domain that they are learning in.

The biggest problem comes when we combine and systematise assessment and accreditation into an industrialised, production-line approach to education, losing sight of the real goals. There are many other ways to assess and accredit that are less harmful or even positively useful (e.g. portfolios, evidence-based assessment, even vivas when done with care and genuine dialogue), and many are actually used in higher education. We just need more of them to redress the balance a bit.