通过从示范活动中学习创造反逆向自我消化技术,采取学习类级一般可实现的一般物体调整政策 (Learning Category-Level Generalizable Object Manipulation Policy via Generative Adversarial Self-Imitation Learning from Demonstrations) - 专知论文

会员服务 ·

0

Learning · 评论员 · 可辨认的 · 泛化理论 · Buffer（公司） ·

2022 年 9 月 13 日

Learning Category-Level Generalizable Object Manipulation Policy via Generative Adversarial Self-Imitation Learning from Demonstrations

翻译：通过从示范活动中学习创造反逆向自我消化技术,采取学习类级一般可实现的一般物体调整政策

Hao Shen,Weikang Wan,He Wang

Generalizable object manipulation skills are critical for intelligent and multi-functional robots to work in real-world complex scenes. Despite the recent progress in reinforcement learning, it is still very challenging to learn a generalizable manipulation policy that can handle a category of geometrically diverse articulated objects. In this work, we tackle this category-level object manipulation policy learning problem via imitation learning in a task-agnostic manner, where we assume no handcrafted dense rewards but only a terminal reward. Given this novel and challenging generalizable policy learning problem, we identify several key issues that can fail the previous imitation learning algorithms and hinder the generalization to unseen instances. We then propose several general but critical techniques, including generative adversarial self-imitation learning from demonstrations, progressive growing of discriminator, and instance-balancing for expert buffer, that accurately pinpoints and tackles these issues and can benefit category-level manipulation policy learning regardless of the tasks. Our experiments on ManiSkill benchmarks demonstrate a remarkable improvement on all tasks and our ablation studies further validate the contribution of each proposed technique.

翻译：通用的物体操纵技能对于智能和多功能机器人在现实世界复杂场景中工作至关重要。尽管最近加强学习工作取得了进展,但学习一个通用的操纵政策仍然非常困难,该政策能够处理几何形形形色色的分解对象。在这项工作中,我们通过模仿学习,以任务不可知的方式解决这一类别级物体操纵政策学习问题,我们不承担手工制作的密集奖赏,而只承担终极奖赏。鉴于这个新颖和具有挑战性的通用政策学习问题,我们找出了几个关键问题,这些问题可能使先前的模仿学习算法失败,阻碍将常规化为看不见的实例。我们然后提出了若干一般性但重要的技术,包括从演示中学习自缩自定义、逐步增加歧视者、为专家缓冲提供例平衡,从而准确地定位和处理这些问题,并有利于分类操纵政策学习,而不论任务如何。我们关于曼斯证明基准的实验表明所有任务都取得了显著的改进,而且我们的消化研究进一步验证了每一项拟议技术的贡献。

0

相关内容

Learning

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

专知会员服务

15+阅读 · 2022年3月12日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

【单样本(One-shot)学习】《One-shot learning》by Pragati Baheti Part 1/2: Definitions and fundamental techniques

【单样本(One-shot)学习】《One-shot learning》by Pragati Baheti Part 1/2: Definitions and fundamental techniques

专知会员服务

30+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

联合高光谱红外探测仪和光谱成像仪同步反演温湿廓线和云参数的研究

国家自然科学基金

0+阅读 · 2013年12月31日

可压缩Navier-Stokes方程和Boltzmann方程解的渐近行为

国家自然科学基金

0+阅读 · 2013年12月31日

基于A-Train卫星观测的沙尘暴数字重构技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性混杂系统中的非光滑优化理论与算法

国家自然科学基金

0+阅读 · 2012年12月31日

Fucí意义下的跨共振的Sturm-Liouville问题

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ拮抗Egr-1对增生性瘢痕TGF-β1促纤维化信号的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

极地海洋边界层碘的环境地球化学

国家自然科学基金

0+阅读 · 2011年12月31日

等离子体助离子液体中可磁分离TiO2形成机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

晚全新世以来中亚咸海湖泊粉尘记录与气候环境变化研究

国家自然科学基金

0+阅读 · 2009年12月31日

由Lé过程驱动的随机时滞偏微分方程的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Benchmarking Deformable Object Manipulation with Differentiable Physics

Arxiv

0+阅读 · 2022年10月24日

Active Exploration for Robotic Manipulation

Arxiv

0+阅读 · 2022年10月23日

Neural Fields for Robotic Object Manipulation from a Single Image

Arxiv

0+阅读 · 2022年10月21日

Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data

Arxiv

0+阅读 · 2022年10月21日

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

Arxiv

0+阅读 · 2022年10月20日

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Arxiv

42+阅读 · 2020年12月21日

Self-correcting Q-Learning

Arxiv

11+阅读 · 2020年12月2日

Learning from Very Few Samples: A Survey

Arxiv

126+阅读 · 2020年9月6日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

VIP会员

文章信息

相关主题

Buffer（公司）

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

【CVPR 2022】可转移的稀疏对抗性攻击，Transferable Sparse Adversarial Attack

专知会员服务

15+阅读 · 2022年3月12日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

104+阅读 · 2022年2月10日

【单样本(One-shot)学习】《One-shot learning》by Pragati Baheti Part 1/2: Definitions and fundamental techniques

【单样本(One-shot)学习】《One-shot learning》by Pragati Baheti Part 1/2: Definitions and fundamental techniques

专知会员服务

30+阅读 · 2020年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

从面向科学的人工智能到智能体科学：自主科学发现综述

面向具身智能的多模态数据存储与检索：综述

【斯坦福博士论文】迈向用于序列建模的结构化智能

【CMU博士论文】水下三维视觉感知与生成

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Benchmarking Deformable Object Manipulation with Differentiable Physics

Arxiv

0+阅读 · 2022年10月24日

Active Exploration for Robotic Manipulation

Arxiv

0+阅读 · 2022年10月23日

Neural Fields for Robotic Object Manipulation from a Single Image

Arxiv

0+阅读 · 2022年10月21日

Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data

Arxiv

0+阅读 · 2022年10月21日

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

Arxiv

0+阅读 · 2022年10月20日

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Arxiv

42+阅读 · 2020年12月21日

Self-correcting Q-Learning

Arxiv

11+阅读 · 2020年12月2日

Learning from Very Few Samples: A Survey

Arxiv

126+阅读 · 2020年9月6日

Self-supervised Learning: Generative or Contrastive

Arxiv

19+阅读 · 2020年7月21日

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning

Arxiv

10+阅读 · 2018年4月11日

相关基金

联合高光谱红外探测仪和光谱成像仪同步反演温湿廓线和云参数的研究

国家自然科学基金

0+阅读 · 2013年12月31日

可压缩Navier-Stokes方程和Boltzmann方程解的渐近行为

国家自然科学基金

0+阅读 · 2013年12月31日

基于A-Train卫星观测的沙尘暴数字重构技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

非线性混杂系统中的非光滑优化理论与算法

国家自然科学基金

0+阅读 · 2012年12月31日

Fucí意义下的跨共振的Sturm-Liouville问题

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ拮抗Egr-1对增生性瘢痕TGF-β1促纤维化信号的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

极地海洋边界层碘的环境地球化学

国家自然科学基金

0+阅读 · 2011年12月31日

等离子体助离子液体中可磁分离TiO2形成机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

晚全新世以来中亚咸海湖泊粉尘记录与气候环境变化研究

国家自然科学基金

0+阅读 · 2009年12月31日

由Lé过程驱动的随机时滞偏微分方程的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员