Explanation-Aware Experience Replay in Rule-Dense Environments

16 December 2021

Francesco Sovrano, Alex Raymond, Amanda Prorok

From arXiv. To appear in IEEE Robotics and Automation Letters (IEEE RA-L). Please cite the published version.

Human environments are often regulated by explicit and complex rulesets. Integrating Reinforcement Learning (RL) agents into such environments motivates the development of learning mechanisms that perform well in rule-dense and exception-ridden environments such as autonomous driving on regulated roads. In this paper, we propose a method for organising experience by means of partitioning the experience buffer into clusters labelled on a per-explanation basis. We present discrete and continuous navigation environments compatible with modular rulesets and 9 learning tasks. For environments with explainable rulesets, we convert rule-based explanations into case-based explanations by allocating state-transitions into clusters labelled with explanations. This allows us to sample experiences in a curricular and task-oriented manner, focusing on the rarity, importance, and meaning of events. We label this concept Explanation-Awareness (XA). We perform XA experience replay (XAER) with intra and inter-cluster prioritisation, and introduce XA-compatible versions of DQN, TD3, and SAC. Performance is consistently superior with XA versions of those algorithms, compared to traditional Prioritised Experience Replay baselines, indicating that explanation engineering can be used in lieu of reward engineering for environments with explainable features.
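The buffer organisation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `ClusteredReplayBuffer`, its methods, and the example labels are hypothetical. It also assumes two simplifications that the abstract does not specify, namely that a cluster is chosen uniformly at random (inter-cluster step) and that a transition is then drawn in proportion to a stored priority (intra-cluster step). The paper's actual prioritisation scheme may differ.

```python
import random
from collections import defaultdict


class ClusteredReplayBuffer:
    """Toy replay buffer partitioned into clusters keyed by an explanation label.

    Hypothetical sketch only; it does not reproduce the XAER algorithm itself.
    """

    def __init__(self, seed=None):
        # Maps explanation label -> list of (priority, transition) pairs.
        self._clusters = defaultdict(list)
        self._rng = random.Random(seed)

    def add(self, transition, explanation, priority=1.0):
        """Store a transition in the cluster named by its explanation label."""
        if priority <= 0:
            raise ValueError("priority must be positive")
        self._clusters[explanation].append((float(priority), transition))

    def sample(self, batch_size):
        """Return a list of (explanation, transition) pairs."""
        if not self._clusters:
            raise ValueError("cannot sample from an empty buffer")
        labels = list(self._clusters)
        batch = []
        for _ in range(batch_size):
            # Inter-cluster step (assumed uniform): rare explanations are
            # drawn as often as common ones, regardless of cluster size.
            label = self._rng.choice(labels)
            entries = self._clusters[label]
            # Intra-cluster step (assumed priority-proportional).
            weights = [priority for priority, _ in entries]
            _, transition = self._rng.choices(entries, weights=weights, k=1)[0]
            batch.append((label, transition))
        return batch

    def cluster_sizes(self):
        """Return the number of stored transitions per explanation label."""
        return {label: len(entries) for label, entries in self._clusters.items()}


if __name__ == "__main__":
    buffer = ClusteredReplayBuffer(seed=0)
    for step in range(98):
        buffer.add(("state", step), explanation="no_rule_broken")
    for step in range(2):
        buffer.add(("state", step), explanation="ran_red_light", priority=2.0)
    print(buffer.cluster_sizes())
    print(buffer.sample(4))
```

Under these assumptions, an event that makes up 2% of the stored transitions still accounts for roughly half of each sampled batch, which is one simple way to focus replay on the rarity of events, as the abstract puts it.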

