Probabilistic text generators have been used to produce fake scientific papers for more than a decade. Such nonsensical papers are easily detected by both humans and machines. Now more complex AI-powered generation techniques produce texts indistinguishable from those of humans, and the generation of scientific texts from a few keywords has been documented. Our study introduces the concept of tortured phrases: unexpected weird phrases in lieu of established ones, such as 'counterfeit consciousness' instead of 'artificial intelligence.' We combed the literature for tortured phrases and studied one reputable journal where they were concentrated en masse. Hypothesising the use of advanced language models, we ran a detector on the abstracts of recent articles of this journal and on several control sets. The pairwise comparisons reveal a concentration of abstracts flagged as 'synthetic' in the journal. We also highlight irregularities in its operation, such as abrupt changes in editorial timelines. We substantiate our call for investigation by analysing several individual dubious articles, stressing questionable features: tortured writing style, citation of non-existent literature, and unacknowledged image reuse. Surprisingly, some websites offer to rewrite texts for free, generating gobbledegook full of tortured phrases. We believe some authors used rewritten texts to pad their manuscripts. We wish to raise awareness of publications containing such questionable AI-generated or rewritten texts that passed (poor) peer review. Deception with synthetic texts threatens the integrity of the scientific literature.