We present a novel method for reliably explaining the predictions of neural networks. We consider an explanation reliable if it identifies the input features relevant to the model output by taking into account both the input and its neighboring data points. Our method builds on the assumption of a smooth landscape of the loss function of the model prediction: a locally consistent loss and gradient profile. A theoretical analysis established in this study suggests that such locally smooth model explanations can be learned from a batch of noisy copies of the input with L1 regularization on the saliency map. Extensive experiments support this analysis, showing that the proposed saliency maps retrieve the original classes of adversarial examples crafted against both naturally and adversarially trained models, significantly outperforming previous methods. We further demonstrate that this strong performance stems from the method's ability to identify input features that are truly relevant to the model output for the input and its neighboring data points, fulfilling the requirements of a reliable explanation.
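As a rough illustration of the idea described above, the sketch below optimizes a saliency map over a batch of noisy copies of a single input with an L1 penalty encouraging sparsity. This is a minimal reading of the abstract, not the authors' exact formulation: the sigmoid masking scheme, the cross-entropy fitting loss, and all hyperparameters (`n_copies`, `noise_std`, `l1_weight`, `lr`, `steps`) are illustrative assumptions.

```python
# Hypothetical sketch: learn a sparse saliency map from noisy copies of one
# input with L1 regularization. Masking scheme and loss are assumptions.
import torch
import torch.nn.functional as F

def learn_saliency_map(model, x, target, n_copies=32, noise_std=0.1,
                       l1_weight=1e-3, lr=0.05, steps=200):
    """x: input tensor of shape (C, H, W); target: integer class label."""
    model.eval()
    m = torch.zeros_like(x, requires_grad=True)   # saliency map parameters
    opt = torch.optim.Adam([m], lr=lr)
    labels = torch.full((n_copies,), target, dtype=torch.long)
    for _ in range(steps):
        noise = noise_std * torch.randn(n_copies, *x.shape)   # noisy copies
        masked = torch.sigmoid(m) * (x.unsqueeze(0) + noise)  # assumed masking
        logits = model(masked)
        loss = F.cross_entropy(logits, labels)                # fit on the batch
        loss = loss + l1_weight * torch.sigmoid(m).sum()      # L1 sparsity term
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(m).detach()
```

In this reading, averaging the prediction loss over noisy copies enforces the locally consistent loss and gradient profile assumed by the smooth-landscape argument, while the L1 term keeps the learned map sparse.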