通过离线至联机强化学习和目标认知国家信息,通过离线至联机强化学习和目标认知国家信息,高效机器人操纵 (Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information) - 专知论文

会员服务 ·

0

INFORMS · Performer · 学成 · 可约的 · 表示学习 ·

2021 年 10 月 21 日

Efficient Robotic Manipulation Through Offline-to-Online Reinforcement Learning and Goal-Aware State Information

翻译：通过离线至联机强化学习和目标认知国家信息,通过离线至联机强化学习和目标认知国家信息,高效机器人操纵

Jin Li,Xianyuan Zhan,Zixu Xiao,Guyue Zhou

End-to-end learning robotic manipulation with high data efficiency is one of the key challenges in robotics. The latest methods that utilize human demonstration data and unsupervised representation learning has proven to be a promising direction to improve RL learning efficiency. The use of demonstration data also allows "warming-up" the RL policies using offline data with imitation learning or the recently emerged offline reinforcement learning algorithms. However, existing works often treat offline policy learning and online exploration as two separate processes, which are often accompanied by severe performance drop during the offline-to-online transition. Furthermore, many robotic manipulation tasks involve complex sub-task structures, which are very challenging to be solved in RL with sparse reward. In this work, we propose a unified offline-to-online RL framework that resolves the transition performance drop issue. Additionally, we introduce goal-aware state information to the RL agent, which can greatly reduce task complexity and accelerate policy learning. Combined with an advanced unsupervised representation learning module, our framework achieves great training efficiency and performance compared with the state-of-the-art methods in multiple robotic manipulation tasks.

翻译：高数据效率的端到端学习机器人操作是机器人的关键挑战之一。使用人类演示数据和无人监督的代理学习的最新方法已证明是提高RL学习效率的一个大有希望的方向。使用演示数据还允许使用模拟学习或最近出现的离线强化学习算法的离线数据“升温”RL政策,模拟学习或最近出现的离线强化学习算法。然而,现有工作往往将离线政策学习和在线探索作为两个不同的过程处理,这些过程往往伴随着离线到在线过渡期间业绩严重下降。此外,许多机器人操作工作涉及复杂的子任务结构,在RL很难以微薄的奖励解决这些任务。在这项工作中,我们提议建立一个统一的离线到线RL框架,解决过渡性业绩下降问题。此外,我们向RL代理引入目标识别状态信息,这可以大大降低任务复杂性并加速政策学习。与先进的不受监控的代理学习模块相结合,我们的框架在多个机器人操纵任务中实现了与最先进的方法相比,培训效率和绩效很高。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【AAAI2020教程】强化学习中的Exploration-Exploitation in Reinforcement Learning

专知会员服务

101+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

专知会员服务

44+阅读 · 2019年11月19日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation

Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation

Arxiv

0+阅读 · 2021年12月16日

Learning to Share in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2021年12月16日

Return-Based Contrastive Representation Learning for Reinforcement Learning

Arxiv

10+阅读 · 2021年2月22日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Hierarchical Deep Multiagent Reinforcement Learning

Hierarchical Deep Multiagent Reinforcement Learning

Arxiv

8+阅读 · 2018年9月25日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Relational Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年6月5日

Hierarchical Reinforcement Learning with Deep Nested Agents

Arxiv

3+阅读 · 2018年5月18日

Sim-to-Real Optimization of Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play

Arxiv

4+阅读 · 2018年4月17日

VIP会员

文章信息

相关主题

相关VIP内容

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【AAAI2020教程】强化学习中的Exploration-Exploitation in Reinforcement Learning

专知会员服务

101+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

专知会员服务

44+阅读 · 2019年11月19日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

《人与智能体在系统工程建模语言V2任务中的性能表现：基于用户中心化的评估方法》308页

《数据安全国家标准体系（2025版）》征求意见稿

AlphaMosaic：人工智能赋能的作战管理系统

《军事行动中通信平台的战略价值：提升战术效能与作战优势》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Reinforcement Learning: An Introduction 2018第二版 500页

Reinforcement Learning: An Introduction 2018第二版 500页

CreateAMind

14+阅读 · 2018年4月27日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation

Centralizing State-Values in Dueling Networks for Multi-Robot Reinforcement Learning Mapless Navigation

Arxiv

0+阅读 · 2021年12月16日

Learning to Share in Multi-Agent Reinforcement Learning

Arxiv

0+阅读 · 2021年12月16日

Return-Based Contrastive Representation Learning for Reinforcement Learning

Arxiv

10+阅读 · 2021年2月22日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Arxiv

12+阅读 · 2021年2月7日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Hierarchical Deep Multiagent Reinforcement Learning

Hierarchical Deep Multiagent Reinforcement Learning

Arxiv

8+阅读 · 2018年9月25日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Relational Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年6月5日

Hierarchical Reinforcement Learning with Deep Nested Agents

Arxiv

3+阅读 · 2018年5月18日

Sim-to-Real Optimization of Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play

Arxiv

4+阅读 · 2018年4月17日

微信扫码咨询专知VIP会员