Proctored exams have fallen to generative AI

A Turkish university candidate was recently arrested after being caught using an AI-powered system to obtain answers to the entrance exam in real time.

Source: Student Caught Using Artificial Intelligence to Cheat on University Entrance Test

[Image: students wired up to a computer while taking their exams]

A couple of years ago (and a few times since) I observed that proctored exams offer no meaningful defence against generative AI, so I am a little surprised it has taken so long for someone to be caught doing this. I guess that others have been more careful.

The candidate used a simple and rather obvious set-up: a camera disguised as a shirt button to read the questions, and a router hidden in a hollowed-out shoe linking to a concealed mobile device, which queried a generative AI (likely ChatGPT-powered) and fed the answers back verbally to an in-ear Bluetooth earpiece. Constructing such a thing would take a little ingenuity but it’s not rocket science. It’s not even computer science. Anyone could do this. It would take some skill to make it work well, though, and that may be why this attempt went wrong: the candidate was caught as a result of their suspicious behaviour, not because anyone directly noticed the tech. I’m trying to imagine the interface: how the machine would know which question to answer (did the candidate have to point their button in the right direction?), how they dealt with dictating the answers at a usable speed (what if they needed an answer repeated? Did they have to tap a microphone a number of times?), and how they managed sequence and pacing (sub-vocalization? moving in a particular way?). These are soluble problems but they are not trivial, and skill would be needed to make the whole thing seem natural.

It may take a little while for this to become a widespread commodity item (and a bit longer for exam-takers to learn to use it unobtrusively), but I’m prepared to bet that someone is working on it, if it is not already available. And, yes, exam-setters will come up with counter-technologies to address this particular threat (scanners? signal blockers? forcing students to strip naked?) but the cheats will be more ingenious, the tech will improve, and so it will go on, in an endless and unwinnable arms race.

Very few people cheat as a matter of course. This candidate was arrested – exam cheating is against the law in Turkey – for attempting to solve the problem they were actually required to solve, which was to pass the test, not to demonstrate their competence. The level of desperation that led them to adopt such a risky solution is hard to imagine, but it is easy to understand how high the stakes must have seemed and how strong the incentive to succeed must have been. The fact that, in most societies, we habitually inflict such tests on both children and adults, on an unimaginably vast scale, will hopefully one day be seen as barbaric, on a par with beating children to make them behave. Such tests are inauthentic, inaccurate, inequitable and, most absurdly of all, a primary cause of the very problem they are designed to solve. We really do need to find a better solution.

Note on the post title: the student was caught, so, as some have pointed out, it would be an exaggeration to say that this one case proves that proctored exams have fallen to generative AI, but I think it is a very safe assumption that this is not a lone example. This is a landmark case because it provides the first direct evidence that this is happening in the wild, not because it is the first time it has ever happened.

So, this is a thing…

Students are now using AIs to write essays and assignments for credit, and they are (probably) getting away with it. This particular instance may be fake, but the tools are widely available and it would be bizarre if no one were using them for this purpose. There are already far too many sites providing stuff like product reviews and news stories (re)written by AIs, and AIs are already being used for academic paper writing. In fact, systems for doing so, like CopyMatic or ArticleGenerator, are now a commodity item. So the next step will be to develop AIs to identify the work of other AIs (in fact, that is already a thing, e.g. here and here), and so it will go on, and on, and on.
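
To give a flavour of how such detectors work: many of them lean on the observation that machine-generated text tends to look unusually predictable to a language model. Here is a minimal sketch of that heuristic, assuming the Hugging Face transformers library with GPT-2 as the scoring model; the threshold is an illustrative assumption, not a validated value.

```python
# Minimal sketch of a perplexity-based AI-text detector. Text that a
# language model finds very predictable (low perplexity) is more likely
# to be machine-generated. Model choice and threshold are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Mean per-token perplexity of `text` under the scoring model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # average cross-entropy per token
    return torch.exp(loss).item()

# Arbitrary illustrative cut-off: real detectors calibrate this carefully.
AI_PERPLEXITY_THRESHOLD = 25.0

def looks_machine_written(text: str) -> bool:
    return perplexity(text) < AI_PERPLEXITY_THRESHOLD
```

Real detectors reportedly combine this with other signals (such as how much predictability varies from sentence to sentence), but the core idea is much the same, which is why they remain so easy to fool by anyone who lightly rewrites the output.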

This kind of thing will usually evade plagiarism checkers with ease, and may frequently fool human markers. For those of us working in educational institutions, I predict that traditionalists will demand that we double down on proctored exams, in a vain attempt to defend a system that is already broken beyond repair. There are better ways to deal with this: getting to know students, making each learning journey (and its outputs) unique and personal, offering support for motivated students rather than trying to ‘motivate’ them, and so on. But that is not enough.

I am rather dreading the time when an artificial student takes one of my courses. The systems are probably too slow, quirky, and expensive right now for real-time deep fakes driven by plausible GANs to fool me, at least in synchronous learning, but I think it could already be done convincingly for asynchronous learning, with relatively little supervision. I think my solution might be to respond with an artificial teacher, into which there has been copious research for some decades, and of which there are many existing examples.

To a significant extent, we already have artificial students, and artificial teachers teaching them. How ridiculous is that? How broken is the system that not only allows it but actively promotes it?

These tools are out there, getting better by the day, and it makes sense for all of us to be using them. As they become more and more ubiquitous, just as we accommodated pocket calculators in the teaching of math, so we will need to accommodate these tools in all aspects of our education. If an AI can produce a plausible new painting in any artist’s style (or essay, or book, or piece of music, or video) then what do humans need to learn, apart from how to get the most out of the machines? If an AI can write a better essay than me, why should I bother? If a machine can teach as well as me, why teach?

This is a wake-up call. Soon, if not already, most of the training data for the AIs will be generated by AIs. Unchecked, the result is going to be a set of ever-worse copies of copies that become what the next generation consumes and learns from, in a vicious spiral that leaves us at best stagnant, at worst something akin to the Eloi in H.G. Wells’s The Time Machine. If we don’t want this to happen then it is time for educators to reclaim, to celebrate, and (perhaps a little) to reinvent our humanity. We need, more and more, to think of education as a process of learning to be, not of learning to do, except insofar as the doing contributes to our being. It’s about people, learning to be people, in the presence of and through interaction with other people. It’s about creativity, compassion, and meaning, not the achievement of outcomes a machine could replicate with ease. I think it should always have been this way.
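
The “copies of copies” spiral can be illustrated with a toy simulation. This is a minimal sketch under the drastic simplifying assumption that a “model” is just a Gaussian fitted to its training data; every number in it is illustrative, not drawn from any real study.

```python
# Toy sketch of model collapse: each generation trains only on samples
# drawn from the previous generation's model. With finite samples the
# fitted variance tends to shrink, so diversity is gradually lost.
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 0.0, 1.0   # generation 0: the "human" data distribution
n = 50                 # small training set per generation

for gen in range(1, 101):
    samples = rng.normal(mu, sigma, n)         # data written by the last model
    mu, sigma = samples.mean(), samples.std()  # next model fits only that data
    if gen % 20 == 0:
        print(f"generation {gen:3d}: mean={mu:+.3f}, std={sigma:.3f}")

# On average the fitted variance decays by a factor of (n-1)/n each
# generation, so later "models" capture an ever-narrower slice of the
# original variety: a crude statistical analogue of Wells's Eloi.
```

Real language models are vastly more complicated, of course, but the research on model collapse points to essentially this dynamic: tails of the distribution disappear first, and sameness compounds.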

Originally posted at: https://landing.athabascau.ca/bookmarks/view/15164121/so-this-is-a-thing

Can GPT-3 write an academic paper on itself, with minimal human input?

Brilliant. The short answer is, of course, yes, and it doesn’t do a bad job of it. This is conceptual art of the highest order.

This is the preprint of a paper written by GPT-3 (as first author) about itself, submitted to “a well-known peer-reviewed journal in machine intelligence”. The second and third authors provided guidance about themes, datasets, weightings, and so on, but that’s as far as it goes. They do provide commentary as the paper progresses, but they tried to keep that as minimal as possible, so that the paper could stand or fall on its own merits. The paper is not too bad. A bit repetitive, a bit shallow, but it’s just a 500-word paper – hardly even an extended abstract – so that’s about par for the course. The arguments and supporting references are no worse than many I have reviewed, and considerably better than some. The use of English is much better than that of the majority of papers I review.

In an article about it in Scientific American, the co-authors describe some of the complexities of the submission process. They actually asked GPT-3 for its consent to publication (it said yes), but this just touches the surface of some of the huge ethical, legal, and social issues that emerge. Boy, there are a lot of those! The second and third authors deserve a prize for this. But what about the first author? Well, clearly it does not deserve one, because its orchestration of phenomena is not for its own use, and it is not even aware that it is doing the orchestration. It has no purpose other than that of the people training it. In fact, despite having written a paper about itself, it doesn’t even know what ‘itself’ is in any meaningful way. But it raises a lot of really interesting questions.

It would be quite interesting to train GPT-3 with (good) student assignments to see what happens. I think it would potentially do rather well. If I were an ethically imperfect, extrinsically-driven student with access to this, I might even get it to write my assignments for me. The results might need a bit of tidying here and there, but the quality of prose and the general quality of the work would probably earn at least a good B, and most likely an A, with very little extra tweaking. With a bit more training it could almost certainly mimic a particular student’s style, including all the quirks that would make it seem more human. Plagiarism detectors wouldn’t stand a chance, and I doubt that many (if any) humans would be able to say with any assurance that it was not the student’s own work.
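
For the curious, the first step would have been unremarkable data wrangling. Here is a hedged sketch of preparing (assignment question, exemplary answer) pairs in the JSONL prompt/completion format that OpenAI’s original GPT-3 fine-tuning endpoint expected; the sample pair and file name are hypothetical.

```python
# Hypothetical preparation of training data for a GPT-3-era fine-tune.
# The separator at the end of the prompt and the leading space on the
# completion follow OpenAI's published formatting guidance for its
# legacy fine-tuning API.
import json

pairs = [
    ("Critically compare two theories of motivation.",
     "Self-determination theory and behaviourism differ most sharply in ..."),
    # ... more (question, answer) pairs, ideally a few hundred ...
]

with open("student_style.jsonl", "w") as f:
    for prompt, completion in pairs:
        f.write(json.dumps({
            "prompt": prompt + "\n\n###\n\n",
            "completion": " " + completion,
        }) + "\n")
```

The resulting file would then be uploaded and a fine-tune job started against a GPT-3 base model; a few hundred pairs would plausibly be enough to pick up an individual student’s stylistic quirks.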

If it’s not already happening, this is coming soon, so I’m wondering what to do about it. I think my own courses are somewhat immune, thanks to the personal and creative nature of the work and the big emphasis on reflection in all of them (though those with essays would be vulnerable), but it would not take too much ingenuity to get GPT-3 to deal with that problem too: at the very least, it could greatly reduce the effort needed. I guess we could train our own AIs to recognize the work of other AIs, but that’s an arms race we’d never be able to definitively win. I can see the exam-loving crowd loving this, but they are in another arms race that they stopped winning long ago – there’s a whole industry devoted to making cheating in exams pay, and it’s leaps ahead of the examiners, including those with both online and in-person proctors. Oral exams, perhaps? They would make it significantly more difficult (though far from impossible) to cheat. I rather like the notion that the only summative assessment model that stands a fair chance of working is the one with which academia began.

It seems to me that the only way educators can sensibly deal with the problem is to completely divorce credentialling from learning and teaching, so there is no incentive to cheat during the learning process. This would have the useful side-effect that our teaching would have to be pretty good and pretty relevant, because students would only come to learn, not to get credentials, so we would have to focus solely on supporting them, rather than controlling them with threats and rewards. That would not be such a bad thing, I reckon, and it is long overdue. Perhaps this will be the catalyst that makes it happen.

As for credentials, that’s someone else’s problem. I don’t say that because I want to wash my hands of it (though I do) but because credentialling has never had anything whatsoever to do with education apart from in its appalling inhibition of effective learning. It only happens at the moment because of historical happenstance, not because it ever made any pedagogical sense. I don’t see why educators should have anything to do with it. Assessment (by which I solely mean feedback from self or others that helps learners to learn – not grades!) is an essential part of the learning and teaching process, but credentials are positively antagonistic to it.

Originally posted at: https://landing.athabascau.ca/bookmarks/view/14216255/can-gpt-3-write-an-academic-paper-on-itself-with-minimal-human-input

Study links student cheating to whether a course is popular or disliked

We already know that extrinsically motivated students (mainly those driven by grades and testing) are far more likely to cheat than those who are more intrinsically motivated. I bookmarked yet another example of this effect just the other day, but there are hundreds if not thousands of research papers that confirm it in many different ways. And, as this article reaffirms, we already know that mastery learning approaches (which focus on supporting learner control, appropriate levels of challenge, and, ideally, social engagement) tend to make cheating far less likely, because they tend to better support intrinsic motivation. Hardly anyone cheats if they are doing stuff they love to do, unless some strong extrinsic force overrides it (like grades, rewards, punishments, hard-to-meet deadlines, etc.).

This research reveals another interesting facet of the problem, one that exactly accords with what self-determination theory would predict: whether the pedagogy is sensible (supportive of intrinsic motivation) or dumb (extrinsically driven), a student’s dislike of a course appears to predict an increased likelihood of cheating. This is pretty obvious when you think about it. If someone does not like a course then, by definition, they are not intrinsically motivated and, if they are still taking it despite that, the only motivation they can possibly have left is extrinsic.

The increased chance of cheating on disliked courses, whether or not mastery learning techniques are used, is completely unsurprising, because it ain’t what you do, it’s the way that you do it. If mastery learning techniques are not working then it probably means that we are simply not using them very well. Most likely there is not enough support, or not enough learner control, or insufficient social engagement, or not enough (or too much) challenge, or there’s too much pressure, or something along those lines. It is actually much more difficult, and usually far more time consuming, to teach well using techniques that respect learner autonomy and individual needs than it is to follow the objectivist, instructivist path, at least in an institutional environment that deeply embeds extrinsic motivation at its very core, so it is not surprising that the attempt quite often fails. It is also very possible that the problem is almost entirely due to the surrounding educational ecosystem. If, for instance, it is one that forces students down institutionally-determined paths whether or not they are ready and whether or not it matters to them, or if not enough time is allowed, or if the stakes for failure are high, then even well-designed courses with enthusiastic, supportive, skilled, well-informed, compassionate, unpressured teachers are not likely to help that much.

Some people will draw a pragmatic lesson from this: look more carefully for cheating on courses that are known to be disliked. That’s not the solution. Others will look at those courses and try to find ways to make them more likeable. That’s much better. But really, once we have done that, we need to ask why anyone would be taking a course that they dislike in the first place. And that points to a central problem with our educational systems and the tightly coupled teaching and accreditation that they embed deep in their bones. Given enough time, support, and skilled tuition, almost anyone can learn almost anything, and love doing so. We live in a time of plenty, where there are usually countless resources, people, and methods for learning almost anything, in almost any practical way, so it makes no sense that people should still be forced to learn in ways that they dislike, at inappropriate times, and at an inappropriate pace. If they do, it is because (one way or another) we make them do so, and that’s the root of the problem. We – the educators and, above all, the educational system – are the cause of cheating, as much as we are the victims of it. And we are the ones who should fix it.

The original paywalled paper can be found here.

Address of the bookmark: https://www.insidehighered.com/news/2017/10/06/study-links-student-cheating-whether-course-popular-or-disliked

Originally posted at: https://landing.athabascau.ca/bookmarks/view/2762299/study-links-student-cheating-to-whether-a-course-is-popular-or-disliked

Highly praised children are more inclined to cheat

The title of this Alphr article is a little misleading, because the point the article rightly makes is that it all depends on the type of praise given. It reports on research from the University of Toronto that confirms (yet again) what should be obvious: praising learners for who they are (‘you’re so smart’) is a really bad idea, while praising what they do (‘you did that well’) is not normally a bad idea. The issue, though, is essentially one of intrinsic vs extrinsic motivation. By praising the person for being a particular way, you are positioning that as the purpose, rather than a side-effect, of the activity, and positioning yourself as the arbiter, so disempowering the learner. By praising the behaviour, you are offering useful feedback on performance that empowers the recipient to choose whether and how to do such things again, as well as supporting needs for relatedness (it shows you care) and competence (it helps them improve). Both forms of praise contribute to feelings of self-esteem, but only one supports intrinsic motivation.

The nice twist in these particular studies (here and here) is that the researchers were looking at effects on morality. They found that ability praise (telling children they are smart) is very strongly correlated with a propensity to cheat. Exactly as theory would predict, kids who have been told that they are smart are significantly more likely to respond to the extrinsic motivation (the need to live up to expectations when given ability praise) by cheating, when given the opportunity. Interestingly, praising the behaviour (performance praise) has little or no effect on the likelihood of cheating when compared with giving no praise at all: it is only when an expectation is set that the children are perceived as smart that cheating behaviour increases. It is also interesting, if tangential, that boys appeared to be far more likely to cheat than girls under all the conditions, though, once primed by ability praise, girls were more likely to cheat than boys who had received no praise or performance praise.

The lesson is nothing like as simple as remembering to just praise the action, not the person. Praising behaviours can, when used badly, be just as disempowering as praising the person. For instance, while in some senses it might be possible to view grades as a kind of abbreviated praise (or punishment, which amounts to much the same thing) for a behaviour, there’s a critical difference: the fact that it will be graded is known in advance by the learner. This is compounded by the fact that the grade matters to them, often more than the performance of the activity itself. Thus, achieving the grade becomes the goal, not the consequence of the behaviour, and it reinforces the power of the grader to determine the behaviour of the learner, with a consequent loss of learner autonomy. That shift from intrinsic to extrinsic motivation is the big issue here, not the praise itself. There are lots of ways to give both performance praise and ability praise that are not coercive. They are only harmful when used to manipulate behaviour.

Address of the bookmark: http://www.alphr.com/science/1007043/highly-praised-children-are-more-inclined-to-cheat


Computer science students should learn to cheat, not be punished for it

This is a well thought-through response to a recent alarmist NYT article about cheating among programming students.

The original NYT article is full of holy pronouncements about the evils of plagiarism, horrified statistics about its extent, and discussions of the arms race, typically involving sleuthing by markers and ever more ornate technological fixes that are always one step behind the most effective cheats (and one step ahead of the dumber ones). This is a lose-lose system. No one benefits. But that’s not the biggest issue with the article. Nowhere does the NYT article mention that the problem is largely caused by the fact that we in academia typically tell programming students to behave in ways that no programmer in their right mind would ever behave (disclaimer: the one programming course that I currently teach, very deliberately, does not do that, so I am speaking here as an atypical outlier).

As this article rightly notes, the essence of programming is re-use of code. Although there are certainly egregiously immoral and illegal ways to do that (even open source coders normally need to religiously cite their sources for significant uses of code written by others), applications are built on layer upon layer upon layer of re-used code: common subroutines and algorithms, snippets, chunks, libraries, classes, components, and a thousand different ways to assemble (in some cases literally) the code of others. We could not do programming at all if 99% of the code that does what we want were not written by others. Programmers knit such things together, often sharing their discoveries and improvements so that the whole profession benefits and the cycle continues. The solution to most problems is, more often than not, to be found on Stack Exchange sites, Reddit, or similar forums, or in open source repositories like GitHub, and it would be an idiotic programmer who chose not to (very critically and very carefully) use snippets provided there. That’s pretty much how programmers learn, a large part of how they solve problems, and certainly how they build stuff. The art of it is in choosing the right snippet, understanding it, fitting it into one’s own code, selecting between alternative solutions and knowing why one is better (in a given context) than another. In many cases, we have memorized ways of doing things so that, even if we don’t literally copy and paste, we repeat patterns (whole lines and blocks) that are often identical to those that we learned from others. It would likely be impossible to even remember where we learned such things, let alone to cite them. We should not penalize that – we should celebrate it. Sure, if the chunks we use are particularly ingenious, or particularly original, or particularly long, or protected by a licence, we should definitely credit their authors. That’s just common sense and decency, as well as (typically) a legal requirement. But a program made using the code of others is no more plagiarism than Kurt Schwitters’s collages were plagiarisms of the myriad found objects that made them up, or a house is a plagiarism of its bricks.
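
Crediting a borrowed snippet need not be onerous: a comment is usually enough. Here is a minimal, hypothetical illustration; the author and URL are placeholders, not a real citation. (Stack Overflow content is CC BY-SA licensed, which requires attribution of exactly this kind.)

```python
# Adapted from a Stack Overflow answer by <author>:
# https://stackoverflow.com/a/<answer-id> (licensed CC BY-SA)
def chunk(items, size):
    """Yield successive chunks of at most `size` items from `items`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]
```

Two lines of comment: provenance preserved, licence honoured, and nobody needs to pretend they invented list slicing.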

And, as an aside, please stop calling it ‘Computer Science’. Programming is no more computer science than carpentry is woodworking science. It bugs me that ‘computer science’ is used so often as a drop-in synonym for programming in the popular press, reinforced by an increasing number of academics with science-envy, especially in North America. There are sciences used in computing, and a tiny percentage of those are quite unique to the discipline, but they account for a minuscule percentage of what is taught in universities and colleges, and a vanishingly small percentage of what nearly all programmers actually do. It’s also worth noting that computer science programs are not just about programming: there’s a whole bunch of stuff we teach (and that computing professionals do) about things like databases, networks, hardware, ethics, etc. that has nothing whatsoever to do with programming (and little to do with science). Programming, though, especially in its design aspects, is a fundamentally human activity that is creative, situated, and inextricably entangled with its social and organizational context. Apart from in some research labs and esoteric applications, it is normally closer to fine art than it is to science, though it is an incredibly flexible activity that spans a gamut of creative pursuits analogous to a broad range of arts and crafts, from poetry to music to interior design to engineering. Perhaps it is most akin to architecture in the ways it can (depending on context) blend art, craft, engineering, and (some) science, but it can be analogous to pretty much any creative pursuit (universal machines and all that).

Address of the bookmark: https://thenextweb.com/dd/2017/05/30/lets-teach-computer-science-students-to-cheat/#.tnw_FTOVyGc4


Over two dozen people with ties to India’s $1-billion exam scam have died mysteriously in recent months

“… the scale of the scam in the central state of Madhya Pradesh is mind-boggling. Police say that since 2007, tens of thousands of students and job aspirants have paid hefty bribes to middlemen, bureaucrats and politicians to rig test results for medical schools and government jobs.

So far, 1,930 people have been arrested and more than 500 are on the run. Hundreds of medical students are in prison — along with several bureaucrats and the state’s education minister. Even the governor has been implicated.”

A billion-dollar fraud scheme, perhaps dozens murdered, nearly 2,000 arrested, hundreds in jail, and hundreds more on the run. How can we defend a system that does this to people? Though opportunities for corruption may be higher in India, the phenomenon is not peculiar to that culture. It is worth remembering that more than two-thirds of Canadian high school students cheat (I have seen some estimates that are notably higher – this was just the first in the search results and it illustrates the point well enough):

According to a survey of Canadian university & college students:

  • Cheated on written work in high school 73%
  • Cheated on tests in high school 58%
  • Cheated on a test as undergrads 18%
  • Helped someone else cheat on a test 8%

According to a survey of 43,000 U.S. high school students:

  • Used the internet to plagiarize 33%
  • Cheated on a test last year 59%
  • Did it more than twice 34%
  • Think you need to cheat to get ahead 39%

Source: http://www.cbc.ca/manitoba/features/universities/

When it is a majority phenomenon, this is the moral norm, not an aberration. The problem is a system that makes cheating a plausible and, for many, a preferable solution, despite their knowing it is wrong. This means the system is flawed, far more than the people in it. The problems emerge primarily because, in the cause of teaching, we make people do things they do not want to do, and threaten or reward them to enforce compliance. It’s not a problem with human nature; it’s a rational reaction to extrinsic motivation, especially when the threat is as great as we make it. Even my dog cheats under those conditions, if she can get away with it. When the point of learning is the reward, then there is no point to learning apart from the reward and, when it’s to avoid punishment, it’s even worse. The quality of learning is always orders of magnitude lower than when we learn something because we want to learn it, or as a side-effect of doing something that interests us, but the direct consequence of extrinsic motivation is to sap away intrinsic motivation, so even those with an interest mostly have at least some of it kicked or cajoled out of them. That’s a failure on a majestic scale. If the tests given in schools and universities had some discriminatory value they might still be justifiable, but perhaps the dumbest thing of all about the whole crazy mess is that a GPA has no predictive value at all when it comes to assessing competence.

Address of the bookmark: http://www.theprovince.com/health/Over+dozen+people+with+ties+India+billion+exam+scam+have+died/11191722/story.html