In most machine learning settings, gaining some degree of explainability, that is, giving users more insight into how and why a network arrives at its predictions, traditionally restricts the underlying model and degrades performance to some degree. For example, decision trees are considered more explainable than deep neural networks, but they underperform on visual tasks. In this work, we empirically demonstrate that applying methods and architectures from the explainability literature can, in fact, achieve state-of-the-art performance on the challenging task of domain generalization while also offering a framework for gaining insight into the prediction and training process. To that end, we develop a set of novel algorithms, including DivCAM, an approach in which the network receives guidance during training via gradient-based class activation maps to focus on a diverse set of discriminative features, as well as ProDrop and D-Transformers, which apply prototypical networks to the domain generalization task, either with self-challenging or with attention alignment. Since these methods offer competitive performance in addition to explainability, we argue that they can serve as tools for improving the robustness of deep neural network architectures.
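To give a rough flavor of how class activation maps can act as a training signal, the following is a minimal PyTorch sketch: a Grad-CAM-style map is computed for the target class, its most discriminative regions are suppressed, and the classifier is asked to remain correct on the remaining features. The class names, the masking heuristic, and the loss weighting are illustrative assumptions only and do not reproduce the paper's DivCAM implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

# Hypothetical sketch of CAM-based training guidance (not the paper's DivCAM).
class CAMGuidedNet(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        backbone = resnet18(weights=None)
        # Keep the convolutional trunk so we can access spatial feature maps.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(backbone.fc.in_features, num_classes)

    def forward(self, x):
        fmap = self.features(x)                                  # (B, C, H, W)
        logits = self.classifier(self.pool(fmap).flatten(1))
        return logits, fmap


def grad_cam(fmap, logits, targets):
    """Grad-CAM: weight feature maps by the gradient of the target-class logit."""
    score = logits.gather(1, targets.view(-1, 1)).sum()
    grads = torch.autograd.grad(score, fmap, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)               # channel importance
    cam = F.relu((weights * fmap).sum(dim=1))                    # (B, H, W)
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-6)


def training_step(model, x, y, mask_ratio=0.3):
    logits, fmap = model(x)
    cls_loss = F.cross_entropy(logits, y)

    # Suppress the most discriminative CAM regions and require a correct
    # prediction from what remains, encouraging reliance on a more diverse
    # set of features (illustrative heuristic).
    cam = grad_cam(fmap, logits, y)
    thresh = torch.quantile(cam.flatten(1), 1 - mask_ratio, dim=1).view(-1, 1, 1)
    keep = (cam < thresh).float().unsqueeze(1)
    masked_logits = model.classifier(model.pool(fmap * keep).flatten(1))
    div_loss = F.cross_entropy(masked_logits, y)

    return cls_loss + 0.5 * div_loss


model = CAMGuidedNet()
x, y = torch.randn(4, 3, 224, 224), torch.randint(0, 7, (4,))
loss = training_step(model, x, y)
loss.backward()
```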