Recently, there has been a rise in the development of powerful pre-trained natural language models, including GPT-2, Grover, and XLM. These models have shown state-of-the-art performance on a variety of NLP tasks, including question answering, content summarisation, and text generation. Alongside this, there have been many studies focused on online authorship attribution (AA): that is, the use of models to identify the authors of online texts. Given the power of natural language models in generating convincing texts, this paper examines the degree to which these language models can generate texts capable of deceiving online AA models. Experimenting with both blog and Twitter data, we utilise GPT-2 language models to generate texts from the existing posts of online users. We then examine whether these AI-based text generators can mimic authorial style to such a degree that they deceive typical AA models. We find that current AI-based text generators are able to successfully mimic authorship, demonstrating this capability on both datasets. Our findings, in turn, highlight the capacity of powerful natural language models to generate original online posts that mimic authorial style sufficiently to deceive popular AA methods; a key finding given the proposed role of AA in real-world applications such as spam detection and forensic investigation.
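To make the described pipeline concrete, the sketch below fine-tunes an off-the-shelf GPT-2 model on a single user's existing posts and then samples a new post in that user's style. It is a minimal illustration using the Hugging Face `transformers` library; the file path, prompt, and hyperparameters are assumptions for demonstration and do not reflect the authors' exact experimental configuration.

```python
# Illustrative sketch (not the paper's exact setup): fine-tune GPT-2 on one
# user's posts, then generate a candidate post mimicking their style.
from transformers import (
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    TextDataset,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Plain-text file of the target user's existing posts (hypothetical path).
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="user_posts.txt",
    block_size=128,
)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-user",
        num_train_epochs=3,            # assumed value for illustration
        per_device_train_batch_size=4, # assumed value for illustration
    ),
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()

# Sample a new post in the user's style from a short (hypothetical) prompt.
prompt_ids = tokenizer.encode("Today I", return_tensors="pt")
output_ids = model.generate(
    prompt_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The generated text from such a model would then be fed to an authorship attribution classifier trained on the same pool of users, to test whether it is attributed to the mimicked author.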