Robot Policy Learning from Demonstration Using Advantage Weighting and Early Termination - 专知论文 (Zhuanzhi Papers)


July 31, 2022

Robot Policy Learning from Demonstration Using Advantage Weighting and Early Termination


Abdalkarim Mohtasib, Gerhard Neumann, Heriberto Cuayahuitl

Learning robotic tasks in the real world remains highly challenging, and effective practical solutions have yet to be found. The traditional methods in this area, imitation learning and reinforcement learning, both have limitations when applied to real robots. Combining reinforcement learning with pre-collected demonstrations is a promising approach to learning control policies for robotic tasks. In this paper, we propose an algorithm that leverages offline expert data through both offline and online training to obtain faster convergence and improved performance. The proposed algorithm (AWET) weights the critic losses with a novel agent advantage weight to improve over the expert data. In addition, AWET uses an automatic early termination technique to stop and discard policy rollouts that are not similar to expert trajectories, which prevents the policy from drifting far from the expert data. In an ablation study, AWET showed improved and promising performance compared to state-of-the-art baselines on four standard robotic tasks.
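The abstract names two mechanisms, an advantage weight on the critic loss and early termination of rollouts that stray from expert trajectories, but gives no formulas. The sketch below illustrates the general idea only and should not be read as the authors' implementation. Everything in it is an assumption: the function names, the Euclidean distance check against the expert state at the same time step, the fixed `threshold`, and the clipped exponential weight, which is borrowed from advantage-weighted-regression-style methods rather than taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

State = List[float]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two states (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rollout_with_early_termination(
    policy: Callable[[State], float],
    step: Callable[[State, float], State],
    start: State,
    expert_states: Sequence[State],
    threshold: float,
    max_steps: int,
) -> Tuple[List[State], bool]:
    """Roll out `policy`; stop as soon as the state drifts from the expert.

    Returns (trajectory, kept). `kept` is False when the rollout was cut
    short and should be discarded instead of being stored for training.
    """
    state = list(start)
    trajectory = [state]
    for t in range(max_steps):
        state = step(state, policy(state))
        trajectory.append(state)
        # Compare with the expert state at the same time index (clamped
        # to the last expert state if the rollout is longer).
        reference = expert_states[min(t + 1, len(expert_states) - 1)]
        if distance(state, reference) > threshold:
            return trajectory, False
    return trajectory, True


def advantage_weight(advantage: float, temperature: float = 1.0,
                     max_weight: float = 20.0) -> float:
    """Clipped exponential weight exp(A / temperature) (assumed form)."""
    scaled = advantage / temperature
    if scaled >= math.log(max_weight):
        return max_weight
    return math.exp(scaled)


def weighted_critic_loss(q_pred: Sequence[float], q_target: Sequence[float],
                         advantages: Sequence[float]) -> float:
    """Mean squared TD error, with each sample scaled by its advantage weight."""
    losses = [
        advantage_weight(a) * (p - t) ** 2
        for p, t, a in zip(q_pred, q_target, advantages)
    ]
    return sum(losses) / len(losses)


# Toy 1-D example: the expert moves +1 per step.
expert = [[0.0], [1.0], [2.0], [3.0]]
move = lambda s, a: [s[0] + a]

good_traj, good_kept = rollout_with_early_termination(
    lambda s: 1.0, move, [0.0], expert, threshold=0.5, max_steps=3)
bad_traj, bad_kept = rollout_with_early_termination(
    lambda s: -1.0, move, [0.0], expert, threshold=0.5, max_steps=3)
```

In this toy setup the rollout that follows the expert runs to completion and is kept, while the one moving in the opposite direction is cut off after its first step and flagged for discarding. In practice the distance measure, the threshold, and the form of the weight would all have to be taken from the paper or tuned per task.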

