通过优化从成功和失败的演示中学习 (Learning from Successful and Failed Demonstrations via Optimization) - 专知论文

会员服务 ·

0

优化器 · 学成 · Performer · 情景 · 散度 ·

2021 年 7 月 26 日

Learning from Successful and Failed Demonstrations via Optimization

翻译：通过优化从成功和失败的演示中学习

Brendan Hertel,S. Reza Ahmadzadeh

from arxiv, 6 pages, 7 figures. Accepted to IROS 2021. Code available at https://github.com/brenhertel/TLFSD Accompanying video at: https://youtu.be/sRdOm_9nJ8g

Learning from Demonstration (LfD) is a popular approach that allows humans to teach robots new skills by showing the correct way(s) of performing the desired skill. Human-provided demonstrations, however, are not always optimal and the teacher usually addresses this issue by discarding or replacing sub-optimal (noisy or faulty) demonstrations. We propose a novel LfD representation that learns from both successful and failed demonstrations of a skill. Our approach encodes the two subsets of captured demonstrations (labeled by the teacher) into a statistical skill model, constructs a set of quadratic costs, and finds an optimal reproduction of the skill under novel problem conditions (i.e. constraints). The optimal reproduction balances convergence towards successful examples and divergence from failed examples. We evaluate our approach through several 2D and 3D experiments in real-world using a UR5e manipulator arm and also show that it can reproduce a skill from only failed demonstrations. The benefits of exploiting both failed and successful demonstrations are shown through comparison with two existing LfD approaches. We also compare our approach against an existing skill refinement method and show its capabilities in a multi-coordinate setting.

翻译：从演示中学习(LfD)是一种流行的方法,它使人类能够通过展示正确的方式来传授机器人新的技能,从而展示出所需的技能。然而,人类提供的演示并非总是最理想的,教师通常通过丢弃或取代亚最佳(噪音或错误)的演示来解决这一问题。我们建议了一种新型的LfD代表方式,既学习成功又学习失败的演示技能。我们的方法将被捕获的演示的两个子集(教师贴上标签)编码成统计技能模型,建立一套等式成本,并在新的问题条件下(即制约)找到最佳技能复制。最佳复制方式平衡了成功范例和与失败实例的差异。我们用UR5操纵臂评估了现实世界中的多个2D和3D实验方法,还表明它只能复制失败的演示技能。通过与现有的两种LfD方法进行比较,展示了利用失败和成功演示的好处。我们还比较了我们的方法与现有技能改进方法的对比,并展示了在多坐标设置中的能力。

0

相关内容

优化器

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【经典书】线性代数，436页pdf

专知会员服务

78+阅读 · 2021年3月16日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

极市平台

4+阅读 · 2018年4月19日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

Arxiv

1+阅读 · 2021年9月23日

Making Human-Like Trade-offs in Constrained Environments by Learning from Demonstrations

Arxiv

0+阅读 · 2021年9月22日

GALOPP: Multi-Agent Deep Reinforcement Learning For Persistent Monitoring With Localization Constraints

Arxiv

0+阅读 · 2021年9月22日

Inverse Constrained Reinforcement Learning

Arxiv

8+阅读 · 2021年5月21日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

Learning a Robust Society of Tracking Parts using Co-occurrence Constraints

Arxiv

4+阅读 · 2018年4月5日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

Discriminability objective for training descriptive captions

Arxiv

9+阅读 · 2018年3月12日

VIP会员

文章信息

相关主题

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【经典书】线性代数，436页pdf

专知会员服务

78+阅读 · 2021年3月16日

【MIT】反偏差对比学习，Debiased Contrastive Learning

【MIT】反偏差对比学习，Debiased Contrastive Learning

专知会员服务

91+阅读 · 2020年7月4日

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

知识图谱推理，50页ppt，Salesforce首席科学家Richard Socher

专知会员服务

111+阅读 · 2020年6月10日

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

【斯坦福】机器学习优化简明导论， Introduction to Optimization for Machine Learning

专知会员服务

93+阅读 · 2020年5月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能驾驶：旧理念与新技术

美军手册：战术心理战分遣队与小组指南 | 68页

军事机器学习设计：关于开发自动化任务摘要系统的梯次化设计科学研究 | 2025最新93页

美国防部自主系统研制试验与鉴定指南 | 2025年最新200页

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

IEEE2018|An Accurate and Real-time 3D Tracking System for Robots

极市平台

4+阅读 · 2018年4月19日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

相关论文

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

Semi-Supervised Imitation Learning with Mixed Qualities of Demonstrations for Autonomous Driving

Arxiv

1+阅读 · 2021年9月23日

Making Human-Like Trade-offs in Constrained Environments by Learning from Demonstrations

Arxiv

0+阅读 · 2021年9月22日

GALOPP: Multi-Agent Deep Reinforcement Learning For Persistent Monitoring With Localization Constraints

Arxiv

0+阅读 · 2021年9月22日

Inverse Constrained Reinforcement Learning

Arxiv

8+阅读 · 2021年5月21日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

Learning a Robust Society of Tracking Parts using Co-occurrence Constraints

Arxiv

4+阅读 · 2018年4月5日

Low-Shot Learning from Imaginary Data

Arxiv

15+阅读 · 2018年4月3日

Discriminability objective for training descriptive captions

Arxiv

9+阅读 · 2018年3月12日

微信扫码咨询专知VIP会员