It has been shown that NLI models are usually biased with respect to the word-overlap between premise and hypothesis; they take this feature as a primary cue for predicting the entailment label. In this paper, we focus on an overlooked aspect of the overlap bias in NLI models: the reverse word-overlap bias. Our experimental results demonstrate that current NLI models are heavily biased towards the non-entailment label on instances with low overlap, and that existing debiasing methods, which are reportedly successful on challenge datasets, are generally ineffective in addressing this category of bias. We investigate the reasons for the emergence of the overlap bias and the role of minority examples in its mitigation. For the former, we find that the word-overlap bias does not stem from pre-training; for the latter, we observe that, in contrast to the accepted assumption, eliminating minority examples does not affect the generalizability of debiasing methods with respect to the overlap bias.
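The word-overlap feature discussed above can be illustrated with a minimal sketch. The tokenization (whitespace split, lowercasing) and the overlap definition (fraction of hypothesis tokens also present in the premise) are illustrative assumptions, not the paper's exact formulation:

```python
def word_overlap(premise: str, hypothesis: str) -> float:
    """Fraction of hypothesis tokens that also appear in the premise.

    Illustrative definition: whitespace tokenization, case-folded,
    set-based (duplicates ignored).
    """
    p_tokens = set(premise.lower().split())
    h_tokens = set(hypothesis.lower().split())
    if not h_tokens:
        return 0.0
    return len(p_tokens & h_tokens) / len(h_tokens)


# High overlap: a biased model tends to predict "entailment" here,
# even though the hypothesis contradicts the premise.
high = word_overlap("the doctor visited the lawyer",
                    "the lawyer visited the doctor")

# Low overlap (the reverse case studied in the paper): a biased model
# tends to predict "non-entailment" regardless of the true label.
low = word_overlap("the doctor visited the lawyer",
                   "a physician made a house call")
```

In this sketch, `high` is 1.0 and `low` is 0.0, matching the two ends of the overlap spectrum the paper contrasts: high-overlap instances pull biased models towards entailment, while low-overlap instances pull them towards non-entailment.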