Rationales, snippets of extracted text that explain an inference, have emerged as a popular framework for interpretable natural language processing (NLP). Rationale models typically consist of two cooperating modules: a selector and a classifier, with the goal of maximizing the mutual information (MMI) between the "selected" text and the document label. Despite their promise, MMI-based methods often pick up on spurious text patterns and result in models with nonsensical behaviors. In this work, we investigate whether counterfactual data augmentation (CDA), without human assistance, can improve the performance of the selector by lowering the mutual information between spurious signals and the document label. Our counterfactuals are produced in an unsupervised fashion using class-dependent generative models. Through an information-theoretic lens, we derive properties of the unaugmented dataset under which our CDA approach would succeed. The effectiveness of CDA is evaluated empirically on two multi-aspect datasets by comparison against several baselines, including an improved MMI-based rationale schema. Our results show that CDA produces rationales that better capture the signal of interest.
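To make the objective concrete, the following is a minimal sketch of the MMI formulation that selector-classifier rationale models optimize; the notation ($X$ for the document, $Y$ for its label, $z$ for a binary selection mask over tokens, $\lambda$ and $\Omega$ for a sparsity regularizer) is illustrative shorthand rather than the paper's own.

\[
\max_{z}\, I(z \odot X;\, Y) \;=\; \max_{z}\, \bigl[H(Y) - H(Y \mid z \odot X)\bigr]
\]

Because $H(Y)$ does not depend on $z$, maximizing this mutual information amounts to minimizing $H(Y \mid z \odot X)$, which is approximated in practice by the classifier's cross-entropy loss, with the selector trained jointly under a sparsity constraint:

\[
\min_{\theta_{\mathrm{sel}},\, \theta_{\mathrm{clf}}}\; \mathbb{E}\Bigl[-\log p_{\theta_{\mathrm{clf}}}\bigl(Y \mid z_{\theta_{\mathrm{sel}}}(X) \odot X\bigr)\Bigr] + \lambda\, \Omega\bigl(z_{\theta_{\mathrm{sel}}}(X)\bigr)
\]

The same lens suggests the CDA intuition stated above: if $C$ denotes a spurious feature, then $I(C;\, Y) = \mathbb{E}_{C}\bigl[\mathrm{KL}\bigl(p(Y \mid C)\,\|\,p(Y)\bigr)\bigr]$, so augmenting the data with counterfactuals that flip $Y$ while holding $C$ fixed pushes $p(Y \mid C)$ toward $p(Y)$, lowering the mutual information between the spurious signal and the label.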