Transformers increasingly dominate the machine learning landscape across many tasks and domains, which makes understanding their outputs ever more important. While their attention modules provide partial insight into their inner workings, attention scores have been shown to be insufficient for explaining the models as a whole. To address this, we propose B-cos transformers, which inherently provide holistic explanations for their decisions. Specifically, we formulate each model component, such as the multi-layer perceptrons, attention layers, and the tokenisation module, to be dynamic linear, which allows us to faithfully summarise the entire transformer via a single linear transform. We apply our proposed design to Vision Transformers (ViTs) and show that the resulting models, dubbed B-cos ViTs, are highly interpretable and perform competitively with baseline ViTs on ImageNet. Code will be made available soon.
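The dynamic-linear idea can be illustrated with a minimal NumPy sketch of a B-cos linear unit, which computes |cos(x, ŵ)|^(B-1) · (ŵᵀx) per output. The function names, the choice B=2, and the epsilon guard are illustrative assumptions, not the authors' implementation; the point is that the same output is reproduced exactly by a single input-dependent linear map W(x)·x.

```python
import numpy as np

def bcos_linear(x, W, B=2.0, eps=1e-12):
    """Sketch of a B-cos transform: each output unit computes
    |cos(x, w)|^(B-1) * (w_hat . x), i.e. a linear map whose
    effective weights depend on the input x."""
    W_hat = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)  # unit-norm rows
    lin = W_hat @ x                                               # w_hat . x per unit
    cos = lin / (np.linalg.norm(x) + eps)                         # cosine with each row
    return np.abs(cos) ** (B - 1) * lin

def effective_weights(x, W, B=2.0, eps=1e-12):
    """The input-dependent matrix W(x) such that bcos_linear(x, W) == W(x) @ x,
    which is what makes the layer 'dynamic linear' and summarisable."""
    W_hat = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)
    cos = (W_hat @ x) / (np.linalg.norm(x) + eps)
    return (np.abs(cos) ** (B - 1))[:, None] * W_hat

x = np.random.randn(8)
W = np.random.randn(4, 8)
assert np.allclose(bcos_linear(x, W), effective_weights(x, W) @ x)
```

Chaining such layers keeps the whole network dynamic linear, since a product of input-dependent matrices is again an input-dependent matrix, which is what allows the full model to be summarised by one linear transform per input.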