Robust Imitation Learning from Corrupted Demonstrations - 专知 Papers


January 29, 2022

Robust Imitation Learning from Corrupted Demonstrations


Liu Liu, Ziyang Tang, Lanqing Li, Dijun Luo

We consider offline Imitation Learning from corrupted demonstrations, where a constant fraction of the data can be noise or even arbitrary outliers. Classical approaches such as Behavior Cloning assume that demonstrations are collected by a presumably optimal expert, and hence may fail drastically when learning from corrupted demonstrations. We propose a novel robust algorithm that minimizes a Median-of-Means (MOM) objective, which guarantees accurate estimation of the policy even in the presence of a constant fraction of outliers. Our theoretical analysis shows that our robust method in the corrupted setting enjoys nearly the same error scaling and sample complexity guarantees as classical Behavior Cloning in the expert demonstration setting. Our experiments on continuous-control benchmarks validate that our method exhibits the predicted robustness and effectiveness, and achieves competitive results compared to existing imitation learning methods.
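To make the Median-of-Means idea concrete, here is a minimal sketch of a generic MOM estimate applied to per-sample losses. It is not the authors' objective or training procedure, which the abstract does not spell out. The function name `median_of_means`, the block count, and the synthetic "behavior cloning losses" are all assumptions chosen for illustration.

```python
import numpy as np


def median_of_means(values, num_blocks, rng):
    """Shuffle the values, split them into num_blocks blocks,
    and return the median of the per-block means."""
    values = np.asarray(values, dtype=float)
    perm = rng.permutation(values.shape[0])
    blocks = np.array_split(values[perm], num_blocks)
    block_means = np.array([block.mean() for block in blocks])
    return float(np.median(block_means))


rng = np.random.default_rng(0)

# Hypothetical per-sample losses: 950 clean values in [0, 1]
# plus 50 arbitrary outliers (5% of the data).
clean = rng.uniform(0.0, 1.0, size=950)
corrupt = np.full(50, 1e6)
losses = np.concatenate([clean, corrupt])

plain_mean = float(losses.mean())
mom = median_of_means(losses, num_blocks=201, rng=rng)

print("plain mean:", plain_mean)
print("median of means:", mom)
```

With 50 outliers and 201 blocks, at most 50 blocks can contain an outlier, so more than half of the block means come from clean data only and the median stays within the clean range. The plain mean, by contrast, is dominated by the outliers. In this sketch the number of blocks has to exceed twice the number of corrupted samples for that argument to hold.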

