When robots operate in the real world, they must handle uncertainties in sensing, acting, and environment dynamics. Many tasks also require reasoning about the long-term consequences of robot decisions. The partially observable Markov decision process (POMDP) offers a principled approach to planning under uncertainty, but its computational complexity grows exponentially with the planning horizon. We propose to use temporally extended macro-actions to cut down the effective planning horizon and thus the exponential factor of the complexity. To this end, we introduce Macro-Action Generator-Critic (MAGIC), an algorithm that learns a macro-action generator from planner feedback and, in turn, uses the learned macro-actions to condition long-horizon planning. Importantly, the generator is trained to directly maximize downstream planning performance. We evaluate MAGIC on several long-horizon planning tasks, showing that it significantly outperforms planning with primitive actions and with hand-crafted macro-actions, both in simulation and on a real robot.
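To make the generator-critic idea concrete, below is a minimal, self-contained sketch of such a training loop, assuming a black-box planner whose value estimate serves as non-differentiable feedback. The class and function names (`Generator`, `Critic`, `plan_with_macros`) and all dimensions are illustrative assumptions, not the authors' implementation; the planner is stubbed out with a toy scoring function.

```python
# Hedged sketch of a generator-critic loop for learning macro-actions.
# Assumptions: a generator proposes macro-action parameters from a context/belief
# vector, a black-box planner scores them (non-differentiable), and a critic
# regresses that score so the generator can be trained by gradient ascent on it.
import torch
import torch.nn as nn

CTX_DIM, MACRO_DIM = 16, 8  # illustrative sizes

class Generator(nn.Module):
    """Maps a belief/context vector to macro-action parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(CTX_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, MACRO_DIM))
    def forward(self, ctx):
        return torch.tanh(self.net(ctx))

class Critic(nn.Module):
    """Predicts the planner's value for a (context, macro-action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(CTX_DIM + MACRO_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, ctx, macros):
        return self.net(torch.cat([ctx, macros], dim=-1))

def plan_with_macros(ctx, macros):
    """Stand-in for a POMDP planner: returns a scalar value per sample."""
    with torch.no_grad():  # planner feedback is treated as non-differentiable
        return -(macros - ctx[:, :MACRO_DIM]).pow(2).sum(dim=-1, keepdim=True)

gen, critic = Generator(), Critic()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

for step in range(200):
    ctx = torch.randn(32, CTX_DIM)                  # batch of task contexts
    macros = gen(ctx)                               # proposed macro-actions
    value = plan_with_macros(ctx, macros.detach())  # planner feedback

    # Critic: regress the planner's achieved value.
    critic_loss = (critic(ctx, macros.detach()) - value).pow(2).mean()
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # Generator: ascend the critic's estimate of downstream planning performance.
    gen_loss = -critic(ctx, gen(ctx)).mean()
    opt_g.zero_grad(); gen_loss.backward(); opt_g.step()
```

In this sketch the critic exists only to give the generator a differentiable surrogate for the planner's performance, which mirrors the abstract's point that the generator is optimized directly for downstream planning quality rather than for an auxiliary objective.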