In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process: given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable. Importantly, edits that are not necessary to flip the applicable label are prohibited. Models trained on the augmented data appear, empirically, to rely less on semantically irrelevant words and to generalize better out of domain. While this work draws loosely on causal thinking, the underlying causal model (even at an abstract level) and the principles underlying the observed out-of-domain improvements remain unclear. In this paper, we introduce a toy analog based on linear Gaussian models, observing interesting relationships between causal models, measurement noise, out-of-domain generalization, and reliance on spurious signals. Our analysis provides some insights that help to explain the efficacy of CAD. Moreover, we develop the hypothesis that while adding noise to causal features should degrade both in-domain and out-of-domain performance, adding noise to non-causal features should lead to relative improvements in out-of-domain performance. This idea inspires a speculative test for determining whether a feature attribution technique has identified the causal spans. If adding noise (e.g., by random word flips) to the highlighted spans degrades both in-domain and out-of-domain performance on a battery of challenge datasets, but adding noise to the complement gives improvements out-of-domain, it suggests we have identified causal spans. We present a large-scale empirical study comparing spans edited to create CAD to those selected by attention and saliency maps. Across numerous domains and models, we find that the hypothesized phenomenon is pronounced for CAD.
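The noising hypothesis can be illustrated in a minimal linear Gaussian simulation (a sketch in the spirit of the toy analog, not the paper's exact construction; the feature names, noise scales, and the sign-flipped spurious correlation used for the out-of-domain split are illustrative assumptions). A causal feature drives the label, while a spurious feature correlates with the label in-domain but has its correlation reversed out-of-domain. Adding noise to the spurious feature during training attenuates the weight the model places on it, which should improve out-of-domain error:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, spurious_sign, rng):
    """Label is driven by the causal feature x_c; the spurious feature x_s
    tracks the label with a domain-dependent sign (illustrative setup)."""
    x_c = rng.normal(size=n)
    y = 2.0 * x_c + 0.5 * rng.normal(size=n)
    x_s = spurious_sign * y + rng.normal(size=n)
    return np.column_stack([x_c, x_s]), y

def fit_ridge(X, y, lam=1e-3):
    # Closed-form ridge regression: (X'X + lam*I)^{-1} X'y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

n = 5000
X_tr, y_tr = make_data(n, spurious_sign=+1.0, rng=rng)    # in-domain
X_ood, y_ood = make_data(n, spurious_sign=-1.0, rng=rng)  # correlation flips

w_plain = fit_ridge(X_tr, y_tr)

# Add noise to the NON-causal (spurious) feature before training only.
X_noisy = X_tr.copy()
X_noisy[:, 1] += 3.0 * rng.normal(size=n)
w_noised = fit_ridge(X_noisy, y_tr)

# Noising the spurious feature shrinks its learned weight (attenuation)
# and improves out-of-domain error relative to the plain model.
print("spurious weight, plain vs noised:", w_plain[1], w_noised[1])
print("OOD MSE, plain vs noised:", mse(w_plain, X_ood, y_ood),
      mse(w_noised, X_ood, y_ood))
```

The mechanism is classical attenuation bias: measurement noise on a regressor shrinks its coefficient toward zero, so noise injected into the non-causal feature redistributes weight onto the causal one, which remains predictive when the spurious correlation flips out of domain.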