We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL). Our system, FastRLAP (faster lap), trains autonomously in the real world, without human interventions, and without requiring any simulation or expert demonstrations. Our system integrates a number of important components to make this possible: we initialize the representations for the RL policy and value function from a large prior dataset of other robots navigating in other environments (at low speed), which provides a navigation-relevant representation. From here, a sample-efficient online RL method uses a single low-speed user-provided demonstration to determine the desired driving course, extracts a set of navigational checkpoints, and autonomously practices driving through these checkpoints, resetting automatically on collision or failure. Perhaps surprisingly, we find that with appropriate initialization and choice of algorithm, our system can learn to drive over a variety of racing courses in less than 20 minutes of online training. The resulting policies exhibit emergent aggressive driving skills, such as timing braking and acceleration around turns and avoiding areas that impede the robot's motion, and over the course of training approach the performance of a human driver using a similar first-person interface.
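The checkpoint-based practice scheme described above can be illustrated with a minimal sketch. The function and parameter names below (`extract_checkpoints`, `practice_lap`, `spacing`, `reach_radius`) are hypothetical and not from the paper's code; the sketch subsamples a demonstration trajectory into spaced checkpoints and greedily drives through them, omitting the learned policy and the automatic reset on collision that the real system uses.

```python
import math

def extract_checkpoints(demo_positions, spacing=2.0):
    """Subsample a demonstration trajectory (list of (x, y) points)
    into checkpoints spaced at least `spacing` apart.
    Hypothetical helper illustrating the idea, not the paper's code."""
    checkpoints = [demo_positions[0]]
    for p in demo_positions[1:]:
        if math.dist(p, checkpoints[-1]) >= spacing:
            checkpoints.append(p)
    return checkpoints

def practice_lap(checkpoints, start, step_size=1.0,
                 reach_radius=0.5, max_steps=100):
    """Toy stand-in for one practice episode: move toward each
    checkpoint in order and return how many were reached.
    The real system instead queries a learned policy and
    auto-resets on collision or failure."""
    pos = list(start)
    idx = 0
    for _ in range(max_steps):
        tx, ty = checkpoints[idx]
        dx, dy = tx - pos[0], ty - pos[1]
        dist = math.hypot(dx, dy)
        if dist < reach_radius:
            idx += 1
            if idx == len(checkpoints):
                break  # lap complete
            continue
        # Take a bounded step toward the current checkpoint.
        scale = min(step_size, dist) / dist
        pos[0] += dx * scale
        pos[1] += dy * scale
    return idx
```

For example, a straight 6-point demonstration at unit intervals with `spacing=2.0` yields three checkpoints, and the toy follower reaches all of them.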