Improving the Out-Of-Distribution Generalization Capability of Language Models: Counterfactually-Augmented Data is not Enough

Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability, as CAD induces language models to exploit causal features and exclude spurious correlations. However, the empirical OOD generalization results obtained with CAD fall short of expectations. In this paper, we attribute this shortfall to the Myopia Phenomenon caused by CAD: language models focus only on the causal features that are edited in the augmentation and exclude other, non-edited causal features. As a result, the potential of CAD is not fully exploited. Based on the structural properties of CAD, we design two additional constraints that help language models extract more complete causal features contained in CAD, thus improving their OOD generalization capability. We evaluate our method on two tasks, Sentiment Analysis and Natural Language Inference, and the experimental results demonstrate that our method can unlock CAD's potential and improve language models' OOD generalization capability.
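
To make the notion of "edited" features concrete, here is a minimal sketch of how the tokens changed between an original sentence and its counterfactual revision can be located. The sentence pair and the helper name `edited_tokens` are hypothetical illustrations, not taken from the paper, and the whitespace-token diff is only an assumption about how an edit might be inspected; it is not the paper's method or its two constraints.

```python
from difflib import SequenceMatcher


def edited_tokens(original: str, counterfactual: str):
    """Return (removed, added) token lists for an original/counterfactual pair.

    Hypothetical helper: tokens are split on whitespace and aligned with
    difflib; every non-matching span counts as part of the edit.
    """
    a, b = original.split(), counterfactual.split()
    removed, added = [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(a[i1:i2])  # tokens only in the original
            added.extend(b[j1:j2])    # tokens only in the counterfactual
    return removed, added


# Made-up sentiment pair: the annotator changes one word to alter the sentiment.
original = "The acting was great and the plot was gripping"
counterfactual = "The acting was terrible and the plot was gripping"

removed, added = edited_tokens(original, counterfactual)
print(removed, added)
```

In this toy pair only `great` and `terrible` differ. Under the abstract's account, a model trained on such pairs may attend mainly to tokens like these and neglect causal features elsewhere in the text that no edit happened to touch.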

