Mitigating the dependence on spurious correlations present in the training dataset is a quickly emerging and important topic in deep learning. Recent approaches incorporate priors on the feature attribution of a deep neural network (DNN) into the training process to reduce the dependence on unwanted features. However, until now one needed to trade off high-quality attributions, satisfying desirable axioms, against the time required to compute them. This in turn led to either long training times or ineffective attribution priors. In this work, we break this trade-off by considering a special class of efficiently axiomatically attributable DNNs for which an axiomatic feature attribution can be computed with only a single forward/backward pass. We formally prove that nonnegatively homogeneous DNNs, here termed $\mathcal{X}$-DNNs, are efficiently axiomatically attributable and show that they can be effortlessly constructed from a wide range of regular DNNs by simply removing the bias term of each layer. Various experiments demonstrate the advantages of $\mathcal{X}$-DNNs, beating state-of-the-art generic attribution methods on regular DNNs for training with attribution priors.
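To make the two central claims concrete, the following is a minimal PyTorch sketch, not the authors' reference implementation: it builds a small bias-free MLP (illustrating that an $\mathcal{X}$-DNN can be obtained from a regular DNN by removing each layer's bias term) and computes an input-times-gradient attribution from a single forward/backward pass, which for nonnegatively homogeneous networks coincides with axiomatic attributions such as Integrated Gradients with a zero baseline. The architecture and the function names `make_x_dnn` and `x_gradient_attribution` are illustrative assumptions, not from the paper.

```python
import torch
import torch.nn as nn

def make_x_dnn() -> nn.Sequential:
    # Removing the bias term of each layer (bias=False) makes this
    # ReLU network nonnegatively homogeneous: f(a * x) = a * f(x) for a >= 0.
    return nn.Sequential(
        nn.Linear(4, 16, bias=False),
        nn.ReLU(),
        nn.Linear(16, 1, bias=False),
    )

def x_gradient_attribution(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    # One forward pass and one backward pass suffice: the attribution is
    # the elementwise product of the input and the gradient of the output.
    x = x.clone().requires_grad_(True)
    out = model(x).sum()
    (grad,) = torch.autograd.grad(out, x)
    return x.detach() * grad

if __name__ == "__main__":
    torch.manual_seed(0)
    model = make_x_dnn()
    x = torch.randn(2, 4)
    print(x_gradient_attribution(model, x))
```

Because this attribution is differentiable and cheap to evaluate, it can in principle be penalized directly inside a training loss, which is what makes effective attribution priors tractable in this setting.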