Explainable artificial intelligence (XAI) aims to make learning machines less opaque and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used to explore and visualize the diversity of feature representations learned by Bayesian neural networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the posterior distribution in terms of human-understandable feature information about the underlying decision-making strategies. Our main findings are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout exhibits greater diversity in feature representations than the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations correlates strongly with the uncertainty estimates, and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra-mode diversity increases. Our findings are consistent with recent deep neural network theory, providing additional intuition about what the theory implies in terms of human-understandable concepts.
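To make finding 2) concrete, the following minimal sketch (our illustration, not the authors' code; the architecture, dropout rate, and sample count are arbitrary assumptions) shows how Monte Carlo dropout produces an ensemble of BNN instances in PyTorch: dropout is kept active at test time, so each stochastic forward pass draws one posterior sample whose feature representations can then be visualized and compared for diversity.

```python
# Minimal sketch (assumption, not the paper's implementation): sampling BNN
# "instances" via Monte Carlo dropout. Each stochastic forward pass corresponds
# to one posterior sample whose feature representations can be compared.
import torch
import torch.nn as nn

class MCDropoutNet(nn.Module):
    def __init__(self, in_dim=784, hidden=256, out_dim=10, p=0.5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
        )
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):
        return self.head(self.features(x))

def mc_samples(model, x, n_samples=30):
    """Draw n_samples stochastic forward passes with dropout kept active."""
    model.train()  # train mode keeps the dropout layers stochastic at inference
    with torch.no_grad():
        return torch.stack([model(x) for _ in range(n_samples)])

model = MCDropoutNet()
x = torch.randn(8, 784)                  # dummy input batch
preds = mc_samples(model, x)             # shape: (30, 8, 10)
mean, std = preds.mean(0), preds.std(0)  # predictive mean and uncertainty
```

The spread across samples (std above) is the uncertainty estimate that, per finding 3), the paper relates to the diversity of the learned feature representations.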