Recently proposed BERT-based evaluation metrics for text generation perform well on standard benchmarks but are vulnerable to adversarial attacks, e.g., attacks targeting information correctness. We argue that this stems (in part) from the fact that they are models of semantic similarity. In contrast, we develop evaluation metrics based on Natural Language Inference (NLI), which we deem a more appropriate modeling choice. We design a preference-based adversarial attack framework and show that our NLI-based metrics are much more robust to the attacks than the recent BERT-based metrics. On standard benchmarks, our NLI-based metrics outperform existing summarization metrics but lag behind state-of-the-art MT metrics. However, combining existing metrics with our NLI metrics yields both higher adversarial robustness (15% to 30%) and higher-quality metrics as measured on standard benchmarks (+5% to 30%).
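To make the setup concrete, below is a minimal sketch of an NLI-based metric, a simple linear combination with an existing similarity metric, and a preference-based robustness check. The model name (`roberta-large-mnli`), the entailment direction (reference as premise, candidate as hypothesis), the mixing weight `w`, and the example sentences are all illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: NLI entailment probability as a metric score, plus a
# preference-based attack check. Assumptions are flagged in comments.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumption: any off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# Read the entailment label index from the model config instead of hard-coding it.
ENTAIL_IDX = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]

def nli_score(premise: str, hypothesis: str) -> float:
    """Probability that `premise` entails `hypothesis`, used as the metric score."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    return probs[0, ENTAIL_IDX].item()

def combined_score(similarity: float, premise: str, hypothesis: str,
                   w: float = 0.5) -> float:
    """Mix an existing similarity metric score in [0, 1] (e.g., from BERTScore)
    with the NLI entailment probability; w = 0.5 is a hypothetical weight."""
    return w * similarity + (1.0 - w) * nli_score(premise, hypothesis)

# Preference-based attack check: a robust metric should prefer the faithful
# candidate over an adversarially perturbed one (here, a negation attack).
reference = "The company reported higher profits in 2020."
faithful = "Profits at the company rose in 2020."
adversarial = "Profits at the company did not rise in 2020."

print("faithful:   ", nli_score(reference, faithful))     # expected: high
print("adversarial:", nli_score(reference, adversarial))  # expected: low
```

Reading the entailment index from `model.config.id2label` avoids hard-coding a label order that varies across NLI checkpoints; the negation example mirrors the preference-based attacks described above, where a metric passes if it scores the faithful candidate above the perturbed one.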