Neural language models' (NLMs') reasoning processes are notoriously hard to explain. Recently, there has been much progress in automatically generating machine rationales of NLM behavior, but less in utilizing the rationales to improve NLM behavior. For the latter, explanation regularization (ER) aims to improve NLM generalization by pushing the machine rationales to align with human rationales. Whereas prior works primarily evaluate such ER models via in-distribution (ID) generalization, ER's impact on out-of-distribution (OOD) generalization is largely underexplored. In addition, little is understood about how ER model performance is affected by the choice of ER criteria or by the number/choice of training instances with human rationales. In light of this, we propose ER-TEST, a protocol for evaluating ER models' OOD generalization along three dimensions: (1) unseen datasets, (2) contrast set tests, and (3) functional tests. Using ER-TEST, we study three key questions: (A) Which ER criteria are most effective for the given OOD setting? (B) How is ER affected by the number/choice of training instances with human rationales? (C) Is ER effective with distantly supervised human rationales? ER-TEST enables comprehensive analysis of these questions by considering a diverse range of tasks and datasets. Through ER-TEST, we show that ER has little impact on ID performance but can yield large gains on OOD performance w.r.t. (1)-(3). Also, we find that the best ER criterion is task-dependent, and that ER can improve OOD performance even with limited and distantly supervised human rationales.