Deep reinforcement learning (DRL) has proven effective for several complex decision-making applications such as autonomous driving and robotics. However, DRL is notoriously limited by its high sample complexity and its lack of stability. Prior knowledge, e.g., in the form of expert demonstrations, is often available but challenging to leverage to mitigate these issues. In this paper, we propose General Reinforced Imitation (GRI), a novel method which combines the benefits of exploration and expert data and is straightforward to implement over any off-policy RL algorithm. We make one simplifying hypothesis: expert demonstrations can be seen as perfect data whose underlying policy gets a constant high reward. Based on this assumption, GRI introduces the notion of an offline demonstration agent, which sends expert data that are processed concurrently with, and indistinguishably from, the experiences coming from the online RL exploration agent. We show that our approach enables major improvements on vision-based autonomous driving in urban environments. We further validate the GRI method on MuJoCo continuous control tasks with different off-policy RL algorithms. Our method ranked first on the CARLA Leaderboard and outperforms World on Rails, the previous state-of-the-art, by 17%.
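To illustrate the core data-mixing idea described above, here is a minimal sketch of a replay buffer that relabels expert transitions with a constant high reward and serves them alongside online exploration data. All names (`MixedReplayBuffer`, `R_DEMO`, `p_demo`) and the buffer design are illustrative assumptions, not the paper's actual implementation.

```python
import random
from collections import deque

R_DEMO = 1.0  # assumed constant high reward assigned to every expert transition

class MixedReplayBuffer:
    """Hypothetical replay buffer mixing online exploration data with expert demos.

    Expert transitions are relabeled with the constant reward R_DEMO and sampled
    indistinguishably from agent transitions, so any off-policy RL algorithm
    (e.g., DQN or SAC) can train on the union of both streams.
    """

    def __init__(self, capacity=100_000, p_demo=0.25):
        self.online = deque(maxlen=capacity)  # experiences from the exploration agent
        self.demo = []                        # fixed offline expert dataset
        self.p_demo = p_demo                  # fraction of each batch drawn from demos

    def add_online(self, s, a, r, s_next, done):
        self.online.append((s, a, r, s_next, done))

    def add_demo(self, s, a, s_next, done):
        # Simplifying hypothesis: the expert policy gets a constant high reward.
        self.demo.append((s, a, R_DEMO, s_next, done))

    def sample(self, batch_size):
        n_demo = int(self.p_demo * batch_size)
        batch = random.sample(self.demo, min(n_demo, len(self.demo)))
        batch += random.sample(list(self.online),
                               min(batch_size - len(batch), len(self.online)))
        random.shuffle(batch)  # the downstream learner cannot tell the two sources apart
        return batch
```

Under this sketch, the off-policy learner simply calls `sample(batch_size)` each update step; from its perspective, demonstration and exploration transitions are one homogeneous stream.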