Imitation learning, in which an agent learns from demonstrations, has been studied and advanced for sequential decision-making tasks where a reward function is not predefined. However, imitation learning methods still require numerous expert demonstration samples to successfully imitate an expert's behavior. To improve sample efficiency, we utilize self-supervised representation learning, which can generate abundant training signals from the given data. In this study, we propose a self-supervised representation-based adversarial imitation learning method that learns state and action representations which are robust to diverse distortions and temporally predictive, for non-image control tasks. In particular, in contrast to existing self-supervised learning methods for tabular data, we propose a different corruption method that makes state and action representations robust to diverse distortions. We observe, both theoretically and empirically, that learning an informative feature manifold with lower sample complexity significantly improves the performance of imitation learning. The proposed method achieves a 39% relative improvement over existing adversarial imitation learning methods on MuJoCo in a setting limited to 100 expert state-action pairs. Moreover, we conduct comprehensive ablations and additional experiments using demonstrations of varying optimality to provide insight into a range of factors.
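As a rough illustration of the kind of corruption-based augmentation described above, the sketch below corrupts a random subset of features in each state-action vector by resampling values from the batch's empirical marginals (in the style of SCARF-like tabular augmentation). The function name, the resampling scheme, and the corruption rate are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def corrupt_features(batch, corruption_rate=0.3, rng=None):
    """Corrupt a random subset of features per row by resampling each
    corrupted entry from that feature's empirical marginal distribution.
    `batch` is an (N, D) array of concatenated state-action vectors.
    This is a hypothetical sketch, not the paper's exact corruption."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = batch.shape
    # Boolean mask of entries to corrupt.
    mask = rng.random((n, d)) < corruption_rate
    # Replacement values: shuffle each feature column independently,
    # so replacements follow the per-feature empirical distribution.
    replacements = np.stack(
        [rng.permutation(batch[:, j]) for j in range(d)], axis=1
    )
    return np.where(mask, replacements, batch), mask

# Usage: the original batch and its corrupted view form a positive pair
# for a contrastive or predictive self-supervised objective.
rng = np.random.default_rng(0)
batch = rng.standard_normal((128, 20))   # 128 state-action pairs, 20 dims
view, mask = corrupt_features(batch, corruption_rate=0.3, rng=rng)
```

Only the masked entries differ from the original batch; unmasked entries pass through unchanged, so the corrupted view stays on the data manifold feature-wise.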