With the rapid progress of deep learning, artificial texts produced by generative models have become common in news and social media. However, such models can be abused to generate fake product reviews, fake news, and even fake political content. This paper presents a solution for the Russian Artificial Text Detection shared task at Dialogue 2022 (RuATD 2022), in which the goal is to identify which model from a given list generated a given text. We apply the DeBERTa pre-trained language model with multiple training strategies to this shared task. Extensive experiments on the RuATD dataset validate the effectiveness of the proposed method. Our submission ranked second in the evaluation phase of RuATD 2022 (Multi-Class).