Understanding Interlocking Dynamics of Cooperative Rationalization

Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output. The selection mechanism is commonly integrated into the model itself by specifying a two-component cascaded system consisting of a rationale generator, which makes a binary selection of the input features (the rationale), and a predictor, which predicts the output based only on the selected features. The components are trained jointly to optimize prediction performance. In this paper, we reveal a major problem with this cooperative rationalization paradigm: model interlocking. Interlocking arises when the predictor overfits to the features selected by the generator, thus reinforcing the generator's selection even if the selected rationales are sub-optimal. The fundamental cause of the interlocking problem is that the rationalization objective to be minimized is concave with respect to the generator's selection policy. We propose a new rationalization framework, called A2R, which introduces a third component into the architecture: a predictor driven by soft attention as opposed to selection. The generator now realizes both soft and hard attention over the features, and these are fed into the two different predictors. While the generator still seeks to support the original predictor's performance, it also minimizes a gap between the two predictors. As we will show theoretically, since the attention-based predictor exhibits a better convexity property, A2R can overcome the concavity barrier. Our experiments on two synthetic benchmarks and two real datasets demonstrate that A2R can significantly alleviate the interlocking problem and find explanations that better align with human judgments. We release our code at https://github.com/Gorov/Understanding_Interlocking.
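
To make the three-component structure concrete, the following is a minimal forward-pass sketch, not the released implementation. It assumes a generator that emits one score per token, a top-k rule for the hard selection, linear predictors, and Jensen-Shannon divergence as the "gap" between the two predictors. All of these choices, and every name used (`a2r_forward`, `W_hard`, `W_soft`, `k`, `lam`), are illustrative assumptions; the abstract only states that the generator produces both soft and hard attention and minimizes a gap between the predictors.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence (assumed gap measure, natural log).
    m = 0.5 * (p + q)
    def kl(a, b):
        return float(np.sum(a * (np.log(a + eps) - np.log(b + eps))))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def a2r_forward(embeddings, scores, W_hard, W_soft, label, k=2, lam=1.0):
    """One forward pass of a hypothetical A2R-style model.

    embeddings: (n_tokens, dim) input features
    scores:     (n_tokens,) generator scores, shared by both attention types
    """
    # Soft attention: a distribution over all tokens.
    soft_attn = softmax(scores)
    # Hard attention: binary rationale mask (top-k is an assumption here).
    hard_mask = np.zeros_like(scores)
    hard_mask[np.argsort(scores)[-k:]] = 1.0

    # Selection-based predictor sees only the selected tokens.
    hard_repr = (hard_mask[:, None] * embeddings).sum(axis=0) / k
    # Attention-based predictor sees an attention-weighted average.
    soft_repr = (soft_attn[:, None] * embeddings).sum(axis=0)

    p_hard = softmax(W_hard @ hard_repr)
    p_soft = softmax(W_soft @ soft_repr)

    ce_hard = float(-np.log(p_hard[label] + 1e-12))
    ce_soft = float(-np.log(p_soft[label] + 1e-12))
    gap = js_divergence(p_hard, p_soft)

    return {
        "hard_mask": hard_mask,
        "soft_attn": soft_attn,
        "ce_hard": ce_hard,
        "ce_soft": ce_soft,
        "gap": gap,
        # Generator-side objective in this sketch: task loss plus weighted gap.
        "generator_loss": ce_soft + lam * gap,
    }

# Toy usage with random inputs (5 tokens, 4-dim features, 2 classes).
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 4))
scores = rng.normal(size=5)
W_h = rng.normal(size=(2, 4))
W_s = rng.normal(size=(2, 4))
out = a2r_forward(emb, scores, W_h, W_s, label=1, k=2, lam=1.0)
```

In this sketch the gap term is what ties the two predictors together: the soft-attention path receives a signal from every token, while the selection path only sees the chosen rationale.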
