GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving - Zhuanzhi Papers


November 16, 2021

GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving


Raphael Chekroun, Marin Toromanoff, Sascha Hornauer, Fabien Moutarde

Deep reinforcement learning (DRL) has been demonstrated to be effective for several complex decision-making applications such as autonomous driving and robotics. However, DRL is notoriously limited by its high sample complexity and its lack of stability. Prior knowledge, e.g. in the form of expert demonstrations, is often available but challenging to leverage to mitigate these issues. In this paper, we propose General Reinforced Imitation (GRI), a novel method which combines the benefits of exploration and expert data and is straightforward to implement on top of any off-policy RL algorithm. We make one simplifying hypothesis: expert demonstrations can be seen as perfect data whose underlying policy gets a constant high reward. Based on this assumption, GRI introduces the notion of an offline demonstration agent. This agent sends expert data which are processed both concurrently and indistinguishably with the experiences coming from the online RL exploration agent. We show that our approach enables major improvements on vision-based autonomous driving in urban environments. We further validate the GRI method on MuJoCo continuous control tasks with different off-policy RL algorithms. Our method ranked first on the CARLA Leaderboard and outperforms World on Rails, the previous state of the art, by 17%.
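The core idea in the abstract (expert data tagged with a constant high reward and stored alongside online experience) can be sketched as a shared replay buffer. This is a minimal illustration and not the authors' implementation: the names `Transition`, `SharedReplayBuffer`, `collect`, the value `DEMO_REWARD = 1.0`, and the mixing probability `demo_prob` are all assumptions introduced here, since the abstract does not specify how the two data sources are interleaved.

```python
import random
from collections import deque
from typing import Iterator, List, NamedTuple, Tuple


class Transition(NamedTuple):
    state: Tuple[float, ...]
    action: int
    reward: float
    next_state: Tuple[float, ...]
    done: bool


# Assumption: the "constant high reward" given to expert data; the value is hypothetical.
DEMO_REWARD = 1.0

# An expert sample has no reward of its own: (state, action, next_state, done).
ExpertSample = Tuple[Tuple[float, ...], int, Tuple[float, ...], bool]


class SharedReplayBuffer:
    """One buffer for both sources, so the learner cannot tell them apart."""

    def __init__(self, capacity: int) -> None:
        self._memory: deque = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._memory.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling over everything stored, expert and online alike.
        return rng.sample(list(self._memory), batch_size)

    def __len__(self) -> int:
        return len(self._memory)


def collect(
    buffer: SharedReplayBuffer,
    online_stream: Iterator[Transition],
    expert_stream: Iterator[ExpertSample],
    demo_prob: float,
    steps: int,
    rng: random.Random,
) -> None:
    """Fill the buffer, drawing from the expert with probability demo_prob.

    The probabilistic mixing rule is an assumption made for this sketch.
    """
    for _ in range(steps):
        if rng.random() < demo_prob:
            state, action, next_state, done = next(expert_stream)
            # Expert data is stored with the constant demonstration reward.
            buffer.add(Transition(state, action, DEMO_REWARD, next_state, done))
        else:
            # Online data keeps the reward returned by the environment.
            buffer.add(next(online_stream))
```

Because both sources end up as ordinary `Transition` records, any off-policy learner that samples from a replay buffer could in principle consume them without modification, which matches the abstract's claim that the method sits on top of any off-policy RL algorithm.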


