As the use of deep neural networks continues to grow, understanding their behaviour has become more crucial than ever. Post-hoc explainability methods are a potential solution, but their reliability is being called into question. Our research investigates how post-hoc visual explanations respond to naturally occurring transformations, often referred to as augmentations. We expect explanations to be invariant under certain transformations, such as changes to the colour map, while responding equivariantly to transformations such as translation, object scaling, and rotation. We find marked differences in robustness depending on the type of transformation, with some explainability methods (such as LRP composites and Guided Backprop) being more stable than others. We also explore the role of training with data augmentation. We provide evidence that explanations are typically less robust to augmentation than classification performance, regardless of whether data augmentation is used during training.
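As an illustration of the equivariance notion described above, the following minimal sketch checks whether a saliency method responds to a translation of the input with the same translation of its explanation. The names `explain_fn`, `translate`, and `equivariance_score` are hypothetical and not taken from the paper, and Pearson correlation is assumed here as the similarity measure; the toy gradient-magnitude "explanation" only serves to make the script self-contained.

```python
import numpy as np


def translate(arr, dx, dy):
    """Shift a 2-D array by (dy, dx) pixels, filling the uncovered area with zeros."""
    out = np.zeros_like(arr)
    h, w = arr.shape
    ys, yd = (slice(0, h - dy), slice(dy, h)) if dy >= 0 else (slice(-dy, h), slice(0, h + dy))
    xs, xd = (slice(0, w - dx), slice(dx, w)) if dx >= 0 else (slice(-dx, w), slice(0, w + dx))
    out[yd, xd] = arr[ys, xs]
    return out


def equivariance_score(explain_fn, image, dx=8, dy=0):
    """Compare the explanation of a translated image with the translated explanation
    of the original image. A perfectly equivariant method gives a correlation of 1.0
    (up to border effects from zero padding)."""
    expl_of_shifted = explain_fn(translate(image, dx, dy))   # explain the shifted input
    shifted_expl = translate(explain_fn(image), dx, dy)      # shift the original explanation
    a, b = expl_of_shifted.ravel(), shifted_expl.ravel()
    return float(np.corrcoef(a, b)[0, 1])


if __name__ == "__main__":
    # Toy stand-in for a saliency method: gradient magnitude of the image itself.
    toy_explain = lambda img: np.hypot(*np.gradient(img))
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    print(f"translation equivariance score: {equivariance_score(toy_explain, img):.3f}")
```

An analogous invariance check (e.g. for a colour-map change) would instead compare the explanation of the transformed input directly against the untransformed explanation.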