If two sentences have the same meaning, it should follow that they are equivalent in their inferential properties, i.e., each sentence should textually entail the other. However, many paraphrase datasets currently in widespread use rely on a sense of paraphrase based on word overlap and syntax. Can we instead teach paraphrase identification models to identify paraphrases in a way that draws on the inferential properties of the sentences and does not over-rely on the lexical and syntactic similarities of a sentence pair? We apply the adversarial paradigm to this question and introduce a new adversarial method of dataset creation for paraphrase identification: the Adversarial Paraphrasing Task (APT), which asks participants to generate paraphrases that are semantically equivalent (in the sense of being mutually implicative) but lexically and syntactically disparate. These sentence pairs can then be used both to test paraphrase identification models (which achieve near-random accuracy on them) and to improve their performance. To accelerate dataset generation, we explore automating APT using T5, and show that the resulting dataset also improves accuracy. We discuss implications for paraphrase detection and release our dataset in the hope of making paraphrase detection models better able to detect sentence-level meaning equivalence.
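The APT acceptance criterion described above (mutual implication combined with lexical dissimilarity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `entails` callable is a hypothetical stand-in for whatever textual entailment judgment is available (a human annotator or an NLI model), token-level Jaccard overlap is a simple substitute for a real lexical and syntactic similarity measure, and the `max_overlap` threshold of 0.5 is arbitrary. It is not the scoring procedure used to build the dataset.

```python
from typing import Callable, Set


def tokens(sentence: str) -> Set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    stripped = {w.strip(".,;:!?\"'()").lower() for w in sentence.split()}
    return stripped - {""}


def jaccard_overlap(a: str, b: str) -> float:
    """Token-level Jaccard similarity (assumed stand-in for lexical similarity)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_adversarial_paraphrase(
    s1: str,
    s2: str,
    entails: Callable[[str, str], bool],  # hypothetical entailment judge
    max_overlap: float = 0.5,             # arbitrary illustrative threshold
) -> bool:
    """True if the pair is mutually implicative and lexically disparate."""
    mutually_implicative = entails(s1, s2) and entails(s2, s1)
    return mutually_implicative and jaccard_overlap(s1, s2) <= max_overlap
```

With a trivial judge such as `lambda premise, hypothesis: True`, the pair "The cat sat on the mat." and "A feline rested upon the rug." passes the check, while a sentence paired with itself is rejected for being lexically identical. A pair entailed in only one direction is rejected regardless of overlap.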