The field of explainable AI has recently seen an explosion in the number of explanation methods for highly non-linear deep neural networks. The extent to which such methods -- often proposed and tested in the domain of computer vision -- are appropriate to address the explainability challenges in NLP remains relatively unexplored. In this work, we consider Contextual Decomposition (CD) -- a Shapley-based input feature attribution method that has been shown to work well for recurrent NLP models -- and we test the extent to which it is useful for models that contain attention operations. To this end, we extend CD to cover the operations necessary for attention-based models. We then compare how long-distance subject-verb relationships are processed by models with and without attention, considering a number of different syntactic structures in two different languages: English and Dutch. Our experiments confirm that CD can successfully be applied to attention-based models as well, providing an alternative Shapley-based attribution method for modern neural networks. In particular, using CD, we show that the English and Dutch models demonstrate similar processing behaviour, but that under the hood there are consistent differences between our attention and non-attention models.
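The core idea behind CD can be illustrated with a minimal sketch (not the authors' code, and the bias-handling convention shown here is one of several used in the literature): every hidden state is split into an additive "relevant" part beta (the contribution of the input features of interest) and an "irrelevant" part gamma (everything else), and both parts are propagated through each layer so that their sum always equals the ordinary forward pass.

```python
import numpy as np

def cd_linear(beta, gamma, W, b):
    """Decompose a linear layer W @ h + b, with h = beta + gamma.

    By linearity, W distributes exactly over the two parts; here the
    bias is assigned to the irrelevant part (an illustrative choice --
    conventions for splitting the bias differ).
    """
    beta_out = W @ beta
    gamma_out = W @ gamma + b
    return beta_out, gamma_out

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
beta = rng.normal(size=4)   # part attributed to the phrase of interest
gamma = rng.normal(size=4)  # remaining context

beta_out, gamma_out = cd_linear(beta, gamma, W, b)
# Sanity check: the two parts sum to the ordinary forward pass.
assert np.allclose(beta_out + gamma_out, W @ (beta + gamma) + b)
```

Extending CD to attention-based models amounts to deriving analogous decomposition rules for the additional operations such models contain; the non-linear steps require approximations, which is where the Shapley-style treatment comes in.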