Extrapolating beyond-demonstrator (BD) performance through imitation learning (IL) aims to learn from and ultimately outperform the demonstrator. Most existing BDIL algorithms proceed in two stages: a reward function is first inferred from demonstrations, and a policy is then learned via reinforcement learning (RL). Such two-stage BDIL algorithms suffer from high computational complexity, weak robustness, and large performance variation; in particular, a poor reward function derived in the first stage inevitably incurs a severe performance loss in the second stage. In this work, we propose a hybrid adversarial imitation learning (HAIL) algorithm that is one-stage, model-free, curiosity-driven, and trained in a generative-adversarial (GA) fashion. Thanks to the one-stage design, HAIL integrates reward function learning and policy optimization into a single procedure, which brings advantages such as low computational complexity, high robustness, and strong adaptability. More specifically, HAIL simultaneously imitates the demonstrator and explores BD performance by utilizing hybrid rewards. Extensive simulation results confirm that HAIL achieves higher performance than other comparable BDIL algorithms.
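As a minimal illustrative sketch only (the exact form of HAIL's hybrid reward is not specified in this abstract), the hybrid-reward idea can be viewed as combining a GAIL-style discriminator-based imitation reward with an intrinsic curiosity bonus. The reward shapes and the weighting coefficient `beta` below are assumptions, not the paper's definitive formulation.

```python
import numpy as np

def hybrid_reward(d_prob, curiosity_bonus, beta=0.5):
    """Combine an adversarial imitation reward with a curiosity bonus.

    d_prob          : discriminator output D(s, a) in (0, 1), i.e. the estimated
                      probability that the (state, action) pair is demonstrator-like.
    curiosity_bonus : intrinsic exploration reward, e.g. a prediction-error signal.
    beta            : weighting coefficient between imitation and exploration
                      (hypothetical value; would need tuning in practice).
    """
    # GAIL-style imitation reward: large when the discriminator believes
    # the transition comes from the demonstrator.
    imitation_reward = -np.log(1.0 - d_prob + 1e-8)
    # Hybrid reward: imitate the demonstrator while still rewarding exploration
    # of novel states, which is what enables beyond-demonstrator performance.
    return imitation_reward + beta * curiosity_bonus

# Example: a demonstrator-like transition plus a moderate curiosity signal.
print(hybrid_reward(d_prob=0.9, curiosity_bonus=0.3))
```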