Despite their remarkable performance, Deep Neural Networks (DNNs) behave as black boxes, hindering user trust in Artificial Intelligence (AI) systems. Research on opening black-box DNNs can be broadly categorized into post-hoc methods and inherently interpretable DNNs. While many surveys have been conducted on post-hoc interpretation methods, little effort has been devoted to inherently interpretable DNNs. This paper reviews existing methods for developing DNNs with intrinsic interpretability, with a focus on Convolutional Neural Networks (CNNs). The aim is to understand the current progress towards fully interpretable DNNs that can cater to different interpretation requirements. Finally, we identify gaps in current work and suggest potential research directions.