Recent years have seen a plethora of work on explaining complex intelligent agents. One example is the development of several algorithms that generate saliency maps showing how much each pixel contributed to the agent's decision. However, most evaluations of such saliency maps focus on image classification tasks. As far as we know, there is no work that thoroughly compares different saliency maps for Deep Reinforcement Learning agents. This paper compares four perturbation-based approaches to creating saliency maps for Deep Reinforcement Learning agents trained on four different Atari 2600 games. All four approaches work by perturbing parts of the input and measuring how much this affects the agent's output. The approaches are compared using three computational metrics: dependence on the learned parameters of the agent (sanity checks), faithfulness to the agent's reasoning (input degradation), and run-time. In particular, during the sanity checks we find issues with two approaches and propose a solution to fix one of those issues.
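To make the shared idea behind the four approaches concrete, the following is a minimal sketch of perturbation-based saliency, assuming a single-frame input and a toy agent; the function name `occlusion_saliency`, the patch size, and the baseline value are illustrative choices, not the specific methods compared in the paper.

```python
import numpy as np

def occlusion_saliency(agent, frame, patch=4, baseline=0.0):
    """Occlude each patch of the input frame with a baseline value and
    record the drop in the Q-value of the agent's chosen action.
    A larger drop marks a more salient region."""
    q_ref = agent(frame)
    action = int(np.argmax(q_ref))  # the agent's greedy action
    h, w = frame.shape
    sal = np.zeros((h, w))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            perturbed = frame.copy()
            perturbed[i:i + patch, j:j + patch] = baseline
            q = agent(perturbed)
            # How much did occluding this patch change the output?
            sal[i:i + patch, j:j + patch] = q_ref[action] - q[action]
    return sal

# Toy "agent": its Q-values depend only on the top-left quadrant,
# so that quadrant should dominate the saliency map.
def toy_agent(frame):
    return np.array([frame[:4, :4].sum(), 0.0])

frame = np.ones((8, 8))
sal = occlusion_saliency(toy_agent, frame, patch=4)
```

The actual approaches differ mainly in how the perturbation is applied (e.g. occlusion versus blurring) and in how the change in the agent's output is measured, which is exactly what the sanity-check, input-degradation, and run-time metrics probe.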