Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks. In order to visualize the parts of the image that led to a certain classification, existing methods either rely on the obtained attention maps or employ heuristic propagation along the attention graph. In this work, we propose a novel way to compute relevancy for Transformer networks. The method assigns local relevance based on the Deep Taylor Decomposition principle and then propagates these relevancy scores through the layers. This propagation involves attention layers and skip connections, which challenge existing methods. Our solution is based on a specific formulation that is shown to maintain the total relevancy across layers. We benchmark our method on very recent visual Transformer networks, as well as on a text classification problem, and demonstrate a clear advantage over the existing explainability methods.
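To make the propagation idea concrete, the sketch below illustrates, under stated assumptions, how relevancy can be pushed through a stack of self-attention blocks while keeping the total relevancy of each token row fixed. It is not the paper's implementation: the function name and the inputs `attn_maps`/`attn_grads` are hypothetical, the gradient-weighted attention term stands in for the per-layer relevancy obtained via Deep Taylor Decomposition, the identity term stands in for the skip connection, and the [CLS] token is assumed to sit at index 0.

```python
import numpy as np

def propagate_relevance(attn_maps, attn_grads):
    """Minimal sketch of relevancy propagation through self-attention layers.

    attn_maps, attn_grads: lists of arrays, one per Transformer block,
    each of shape (num_heads, num_tokens, num_tokens).
    Returns a relevancy score per input token for the [CLS] token.
    """
    num_tokens = attn_maps[0].shape[-1]
    # Start from the identity: each token is initially fully relevant to itself.
    R = np.eye(num_tokens)
    for A, dA in zip(attn_maps, attn_grads):
        # Gradient-weighted attention, positive contributions only, averaged
        # over heads (a simplified stand-in for the DTD-based layer relevancy).
        A_bar = np.clip(A * dA, a_min=0, a_max=None).mean(axis=0)
        # Identity term models the skip connection around the attention block.
        layer = np.eye(num_tokens) + A_bar
        # Row-normalize so each token's outgoing relevancy sums to 1,
        # i.e. total relevancy is conserved across the layer.
        layer = layer / layer.sum(axis=-1, keepdims=True)
        R = layer @ R
    # Relevancy of the remaining tokens with respect to the [CLS] token (row 0).
    return R[0, 1:]
```

The row normalization is the point of interest here: it is one simple way to realize the conservation property the abstract refers to, so that relevancy is redistributed across tokens at every layer rather than amplified or lost.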