Exploration in environments with sparse rewards is difficult for artificial agents. Curiosity-driven learning -- using feed-forward prediction errors as intrinsic rewards -- has achieved some success in these scenarios, but fails when faced with action-dependent noise sources. We present aleatoric mapping agents (AMAs), a neuroscience-inspired solution modeled on the cholinergic system of the mammalian brain. AMAs aim to explicitly ascertain which dynamics of the environment are unpredictable, regardless of whether those dynamics are induced by the actions of the agent. This is achieved by generating separate forward predictions for the mean and variance of future states and reducing intrinsic rewards for those transitions with high aleatoric variance. We show that AMAs are able to effectively circumvent action-dependent stochastic traps that immobilise conventional curiosity-driven agents. The code for all experiments presented in this paper is open-sourced: http://github.com/self-supervisor/Escaping-Stochastic-Traps-With-Aleatoric-Mapping-Agents.
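The core mechanism described above -- predicting both the mean and the variance of the next state, and discounting the intrinsic reward by the predicted aleatoric variance -- can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear prediction heads, dimensions, and the exact reward formula (squared error minus predicted variance, clipped at zero) are assumptions for the sake of a self-contained example; in practice the heads would be neural networks trained with a Gaussian negative log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy forward model: two linear heads mapping (state, action)
# to the predicted mean and log-variance of the next state. Linear maps
# stand in for the neural networks used in practice.
STATE_DIM, ACTION_DIM = 4, 2
W_mu = rng.normal(size=(STATE_DIM + ACTION_DIM, STATE_DIM)) * 0.1
W_logvar = rng.normal(size=(STATE_DIM + ACTION_DIM, STATE_DIM)) * 0.1

def predict(state, action):
    """Return the predicted mean and log-variance of the next state."""
    x = np.concatenate([state, action])
    return x @ W_mu, x @ W_logvar

def nll_loss(state, action, next_state):
    """Gaussian negative log-likelihood used to train both heads:
    the variance head learns to absorb irreducible (aleatoric) noise."""
    mu, logvar = predict(state, action)
    return 0.5 * np.sum(logvar + (next_state - mu) ** 2 / np.exp(logvar))

def intrinsic_reward(state, action, next_state):
    """Prediction error discounted by predicted aleatoric variance:
    transitions the model deems irreducibly noisy earn little reward,
    so action-dependent noise sources stop acting as curiosity traps."""
    mu, logvar = predict(state, action)
    sq_error = np.sum((next_state - mu) ** 2)
    variance = np.sum(np.exp(logvar))
    return max(sq_error - variance, 0.0)
```

With this formulation, a deterministic-but-unlearned transition yields a large reward (high error, low predicted variance), while a stochastic trap yields little reward once the variance head has learned that the transition is inherently noisy.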