No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE - Zhuanzhi Papers

April 3, 2021

No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE

HaoChih Lin, Baopu Li, Xin Zhou, Jiankun Wang, Max Q.-H. Meng

Most current imitation learning (IL) algorithms require interactions with either the environment or an expert policy during training. For IL problems without interactions, a typical approach is Behavior Cloning (BC). However, BC-like methods tend to suffer from distribution shift. To mitigate this problem, we propose a Robust Model-Based Imitation Learning (RMBIL) framework that casts imitation learning as an end-to-end differentiable nonlinear closed-loop tracking problem. RMBIL applies Neural ODE to learn precise multi-step dynamics and a robust tracking controller via the Nonlinear Dynamics Inversion (NDI) algorithm. The learned NDI controller is then combined with a trajectory generator, a conditional VAE, to imitate an expert's behavior. Theoretical derivation shows that the controller network can approximate an NDI controller when the training loss of the Neural ODE is minimized. Experiments on MuJoCo tasks also demonstrate that RMBIL is competitive with the state-of-the-art generative adversarial method (GAIL) and achieves at least a 30% performance gain over BC on uneven surfaces.
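To make the idea of tracking by Nonlinear Dynamics Inversion concrete, here is a minimal sketch on a toy scalar system. It is not the RMBIL implementation. The abstract learns the dynamics with a Neural ODE, whereas this sketch assumes known control-affine dynamics x' = f(x) + g(x)u. The functions `f` and `g`, the sine reference trajectory, the feedback gain, and the Euler step size are all hypothetical choices made for illustration.

```python
import math


def f(x):
    # Assumed drift term of the toy system (hypothetical, not from the paper).
    return -x ** 3


def g(x):
    # Assumed input gain; 1 + x^2 is never zero, so it can always be inverted.
    return 1.0 + x ** 2


def ndi_control(x, x_ref, x_ref_dot, gain=5.0):
    # Desired state derivative: reference velocity plus proportional error feedback.
    v = x_ref_dot + gain * (x_ref - x)
    # Invert the dynamics x' = f(x) + g(x) * u so that x' equals v.
    return (v - f(x)) / g(x)


def simulate(x0=1.0, dt=1e-3, steps=5000):
    # Forward-Euler rollout of the closed loop tracking x_ref(t) = sin(t).
    x = x0
    for n in range(steps):
        t = n * dt
        u = ndi_control(x, math.sin(t), math.cos(t))
        x += dt * (f(x) + g(x) * u)
    return x, math.sin(steps * dt)


final_x, final_ref = simulate()
tracking_error = abs(final_ref - final_x)
print(f"final tracking error: {tracking_error:.2e}")
```

With the inversion in place, the closed-loop error obeys a linear decay law set by the gain, so the state converges to the reference even though it starts away from it. This kind of closed-loop correction is what a pure behavior-cloning policy lacks under distribution shift.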
