As artificial intelligence (AI) technologies become increasingly powerful and prominent in society, their misuse is a growing concern. In educational settings, AI technologies could be used by students to cheat on assignments and exams. In this paper we explore whether transformers can be used to solve introductory-level programming assignments while bypassing commonly used AI tools that detect similarities between pieces of software. We find that a student using GPT-J [Wang and Komatsuzaki, 2021] can complete introductory-level programming assignments without triggering suspicion from MOSS [Aiken, 2000], a widely used software similarity and plagiarism detection tool. This holds despite the fact that GPT-J was not trained on the problems in question and is not provided with any examples to work from. We further find that the code written by GPT-J is diverse in structure, lacking any particular tells that future plagiarism detection techniques might use to identify algorithmically generated code. We conclude with a discussion of the ethical and educational implications of large language models and directions for future research.