Deep neural networks (DNNs) have greatly impacted numerous fields over the past decade. Yet despite exhibiting superb performance on many problems, their black-box nature still poses a significant challenge with respect to explainability. Indeed, explainable artificial intelligence (XAI) is crucial in several fields, wherein the answer alone -- without an account of how that answer was derived -- is of little value. This paper uncovers a troubling property of explanation methods for image-based DNNs: by making small visual changes to the input image -- hardly influencing the network's output -- we demonstrate how explanations may be arbitrarily manipulated through the use of evolution strategies. Our novel algorithm, AttaXAI, a model-agnostic, adversarial attack on XAI algorithms, only requires access to the output logits of a classifier and to the explanation map; these weak assumptions render our approach highly useful where real-world models and data are concerned. We compare our method's performance on two benchmark datasets -- CIFAR100 and ImageNet -- using four different pretrained deep-learning models: VGG16-CIFAR100, VGG16-ImageNet, MobileNet-CIFAR100, and Inception-v3-ImageNet. We find that XAI methods can be manipulated without the use of gradients or other model internals. Our algorithm successfully manipulates an image, in a manner imperceptible to the human eye, such that the XAI method outputs a specific explanation map. To our knowledge, this is the first such method in a black-box setting, and we believe it has significant value where explainability is desired, required, or legally mandated.
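To make the black-box setting concrete, the sketch below shows one way such a gradient-free, evolution-strategy attack could be organized: candidate perturbations are sampled, scored only via queries to the classifier's logits and to the explanation map, and aggregated into an update that pushes the explanation toward a chosen target while keeping the prediction close to the original. This is a minimal illustration, not the paper's AttaXAI implementation; the functions `model_logits` and `explain_fn`, the loss weights, and the NES-style update are assumptions made here for clarity.

```python
import numpy as np

def es_explanation_attack(x, target_map, model_logits, explain_fn,
                          pop_size=20, sigma=0.05, lr=0.01, steps=200,
                          alpha=1.0, beta=1.0):
    """Illustrative black-box, gradient-free attack on an explanation map.

    x            -- input image as a float array in [0, 1]
    target_map   -- desired explanation map (same spatial shape as explain_fn's output)
    model_logits -- callable returning the classifier's output logits for an image (query access only)
    explain_fn   -- callable returning the explanation map for an image (query access only)
    """
    x_adv = x.copy()
    base_logits = model_logits(x)  # logits of the unmodified image, queried once

    for _ in range(steps):
        # Sample a population of Gaussian perturbations around the current image.
        noise = np.random.randn(pop_size, *x.shape)
        losses = np.empty(pop_size)
        for i in range(pop_size):
            cand = np.clip(x_adv + sigma * noise[i], 0.0, 1.0)
            # Loss: distance of the explanation from the target,
            # plus drift of the logits away from the original prediction.
            expl_term = np.mean((explain_fn(cand) - target_map) ** 2)
            pred_term = np.mean((model_logits(cand) - base_logits) ** 2)
            losses[i] = alpha * expl_term + beta * pred_term

        # NES-style gradient estimate with fitness shaping (standardized losses).
        z = (losses - losses.mean()) / (losses.std() + 1e-8)
        grad_est = (z.reshape(pop_size, *([1] * x.ndim)) * noise).mean(axis=0) / sigma

        # Descend the estimated gradient and keep the image in a valid range.
        x_adv = np.clip(x_adv - lr * grad_est, 0.0, 1.0)

    return x_adv
```

The key design point mirrored here is that the search never touches model weights or gradients: every evaluation is a forward query for logits and an explanation map, which is exactly the weak access assumption stated above.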