End-to-End Urban Driving by Imitating a Reinforcement Learning Coach - Zhuanzhi Papers


August 26, 2021

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach


Zhejun Zhang, Alexander Liniger, Dengxin Dai, Fisher Yu, Luc Van Gool

From arXiv, ICCV 2021

End-to-end approaches to autonomous driving commonly rely on expert demonstrations. Although humans are good drivers, they are not good coaches for end-to-end algorithms that demand dense on-policy supervision. On the contrary, automated experts that leverage privileged information can efficiently generate large scale on-policy and off-policy demonstrations. However, existing automated experts for urban driving make heavy use of hand-crafted rules and perform suboptimally even on driving simulators, where ground-truth information is available. To address these issues, we train a reinforcement learning expert that maps bird's-eye view images to continuous low-level actions. While setting a new performance upper-bound on CARLA, our expert is also a better coach that provides informative supervision signals for imitation learning agents to learn from. Supervised by our reinforcement learning coach, a baseline end-to-end agent with monocular camera-input achieves expert-level performance. Our end-to-end agent achieves a 78% success rate while generalizing to a new town and new weather on the NoCrash-dense benchmark and state-of-the-art performance on the more challenging CARLA LeaderBoard.
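The idea of "dense on-policy supervision" from an automated expert can be illustrated with a toy, DAgger-style loop. This is only a sketch under strong assumptions and not the paper's agent: there is no CARLA, no bird's-eye view input, and no camera. The "expert" is a hypothetical proportional controller on a 1-D lateral offset, the learner is a single linear gain, and every name below (`expert_action`, `fit_gain`, `rollout`, `dagger`) is invented for illustration.

```python
import random


def expert_action(offset: float) -> float:
    """Hypothetical privileged expert: steer proportionally back to the lane centre."""
    return max(-1.0, min(1.0, -0.5 * offset))


def fit_gain(data):
    """Least-squares fit of action = gain * offset (line through the origin)."""
    num = sum(x * a for x, a in data)
    den = sum(x * x for x, _ in data)
    return num / den if den > 0 else 0.0


def rollout(gain, steps, rng):
    """Let the learner drive and return the lateral offsets it actually visited."""
    offset = rng.uniform(-1.0, 1.0)
    visited = []
    for _ in range(steps):
        visited.append(offset)
        # Toy dynamics: the learner's steering shifts the offset, plus small noise.
        offset = offset + gain * offset + rng.gauss(0.0, 0.05)
    return visited


def dagger(iterations=5, steps=50, seed=0):
    """Aggregate expert labels on learner-visited states, refitting each round."""
    rng = random.Random(seed)
    data = []
    gain = 0.0  # untrained learner: never steers
    for _ in range(iterations):
        states = rollout(gain, steps, rng)                # on-policy: the learner drives
        data += [(s, expert_action(s)) for s in states]   # the expert labels those states
        gain = fit_gain(data)                             # supervised update on all data
    return gain


if __name__ == "__main__":
    print(dagger())
```

The point of the sketch is the data flow: the learner chooses which states are visited, and the automated expert labels every one of them. A human demonstrator cannot cheaply provide labels in this way, which is the limitation the abstract attributes to human drivers.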



