Is Sparse Attention more Interpretable?

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet the attention distribution is typically over representations internal to the model rather than the inputs themselves, suggesting this assumption may not have merit. We build on the recent work exploring the interpretability of attention; we design a set of experiments to help us understand how sparsity affects our ability to use attention as an explainability tool. On three text classification tasks, we verify that only a weak relationship between inputs and co-indexed intermediate representations exists -- under sparse attention and otherwise. Further, we do not find any plausible mappings from sparse attention distributions to a sparse set of influential inputs through other avenues. Rather, we observe in this setting that inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
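To make "sparse attention" concrete, here is a minimal sketch that contrasts softmax with sparsemax, one common sparsity-inducing transformation. The abstract does not say which transformation the authors use, so sparsemax is an assumption made purely for illustration, and the function names and example scores are hypothetical. Note that the scores being normalized would, in a real model, come from internal representations rather than from the raw inputs, which is the gap the abstract points to.

```python
import math


def softmax(scores):
    """Dense attention: every position receives a strictly positive weight."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparse attention: Euclidean projection of the scores onto the simplex.

    Illustrative choice only; low-scoring positions get exactly zero weight.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, z in enumerate(z_sorted, start=1):
        cumsum += z
        # Position j stays in the support while 1 + j * z_(j) > sum of top-j scores.
        if 1 + j * z > cumsum:
            tau = (cumsum - 1) / j  # threshold from the largest valid support so far
    return [max(s - tau, 0.0) for s in scores]


# Hypothetical attention scores over three intermediate representations.
scores = [1.0, 0.8, 0.1]
dense = softmax(scores)
sparse = sparsemax(scores)
print("softmax  :", [round(w, 3) for w in dense])
print("sparsemax:", [round(w, 3) for w in sparse])
```

Both outputs sum to one, but only sparsemax assigns exactly zero weight to some positions. A zero weight on an intermediate representation does not by itself show that the co-indexed input token had no influence on the prediction.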

Related content

Attention mechanism

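The attention mechanism was first proposed in the field of visual imaging, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly on a machine translation task; their work is generally regarded as the first to apply attention to NLP. Similar attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of progress in attention research.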
