Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability, because CAD induces language models to exploit causal features and exclude spurious correlations. However, the empirical OOD generalization results obtained with CAD are not as effective as expected. In this paper, we attribute this shortfall to the Myopia Phenomenon caused by CAD: language models focus only on the causal features that are edited in the augmentation and exclude other, non-edited causal features. As a result, the potential of CAD is not fully exploited. Based on the structural properties of CAD, we design two additional constraints that help language models extract more complete causal features contained in CAD, thus improving their OOD generalization capability. We evaluate our method on two tasks, Sentiment Analysis and Natural Language Inference, and the experimental results demonstrate that our method can unlock CAD's potential and improve language models' OOD generalization capability.