深神经网络模型解释的有力性:调查 (On the Robustness of Explanations of Deep Neural Network Models: A Survey)

Explainability has been widely stated as a cornerstone of the responsible and trustworthy use of machine learning models. With the ubiquitous use of Deep Neural Network (DNN) models expanding to risk-sensitive and safety-critical domains, many methods have been proposed to explain the decisions of these models. Recent years have also seen concerted efforts that have shown how such explanations can be distorted (attacked) by minor input perturbations. While there have been many surveys that review explainability methods themselves, there has been no effort hitherto to assimilate the different methods and metrics proposed to study the robustness of explanations of DNN models. In this work, we present a comprehensive survey of methods that study, understand, attack, and defend explanations of DNN models. We also present a detailed review of different metrics used to evaluate explanation methods, as well as describe attributional attack and defense methods. We conclude with lessons and take-aways for the community towards ensuring robust explanations of DNN model predictions.

翻译：人们广泛认为,解释性是负责任和可信赖地使用机器学习模型的基石。随着深神经网络模型普遍被用于扩大至风险敏感和安全关键领域,提出了许多方法来解释这些模型的决定。近年来,人们还一致作出努力,表明这种解释如何被小的输入干扰扭曲(攻击)。虽然许多调查都审查了解释性方法本身,但迄今没有努力吸收为研究DNN模型的可靠解释而提出的不同方法和指标。我们在此工作中对研究、理解、攻击和维护DNN模型解释的方法进行了全面调查。我们还详细审查了用于评价解释方法的不同指标,并描述了归属攻击和防御方法。我们总结了社区的经验教训和取舍,以确保对DNNN模型预测作出有力的解释。

相关内容

MoDELS

关注 43

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

神经网络与形式语言综述，12页pdf，A Survey of Neural Networks and Formal Languages

专知会员服务

21+阅读 · 2020年6月4日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日