Safely navigating an urban environment without violating any traffic rules is a crucial performance target for reliable autonomous driving. In this paper, we present a Reinforcement Learning (RL) based methodology to DEtect and FIX (DeFIX) failures of an Imitation Learning (IL) agent by extracting infraction spots and reconstructing mini-scenarios at these infraction areas to train an RL agent that fixes the shortcomings of the IL approach. DeFIX is a continuous learning framework, where the extraction of failure scenarios and the training of RL agents are executed in an infinite loop. After each new policy is trained and added to the library of policies, a policy classifier decides which policy to activate at each step during evaluation. We demonstrate that even with only one RL agent trained on the failure scenarios of an IL agent, the DeFIX method is either competitive with or outperforms state-of-the-art IL- and RL-based autonomous urban driving benchmarks. We trained and validated our approach on the most challenging map (Town05) of the CARLA simulator, which involves complex, realistic, and adversarial driving scenarios. The source code is publicly available at https://github.com/data-and-decision-lab/DeFIX
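To make the detect-and-fix loop concrete, below is a minimal, hypothetical sketch of the continuous learning cycle described above. The helper names (evaluate_agent, build_mini_scenario, train_rl_policy, PolicyLibrary) are illustrative placeholders, not the authors' API, and the evaluation-time policy classifier is omitted for brevity.

```python
# Hypothetical sketch of the DeFIX continuous-learning loop (not the authors' code).
# All helpers are stubbed placeholders standing in for the real CARLA-based pipeline.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Policy:
    name: str
    act: Callable  # maps an observation to a control action


@dataclass
class PolicyLibrary:
    policies: List[Policy] = field(default_factory=list)

    def add(self, policy: Policy) -> None:
        self.policies.append(policy)


def evaluate_agent(library: PolicyLibrary) -> List[dict]:
    """Drive with the current policy library and return infraction records (stub)."""
    return []  # e.g. [{"location": (x, y), "type": "collision"}, ...]


def build_mini_scenario(infraction: dict) -> dict:
    """Reconstruct a short scenario around the infraction spot (stub)."""
    return {"spawn": infraction.get("location"), "length_m": 50}


def train_rl_policy(scenario: dict) -> Policy:
    """Train an RL agent on the reconstructed failure scenario (stub)."""
    return Policy(name=f"rl_fix_{scenario['spawn']}", act=lambda obs: obs)


def defix_loop(il_policy: Policy, max_iterations: int = 3) -> PolicyLibrary:
    library = PolicyLibrary([il_policy])
    for _ in range(max_iterations):               # the paper runs this loop continuously
        infractions = evaluate_agent(library)     # DEtect: find failures of the current agent
        if not infractions:
            break
        for infraction in infractions:
            scenario = build_mini_scenario(infraction)  # rebuild the failure spot
            library.add(train_rl_policy(scenario))      # FIX: add a new RL policy to the library
    return library
```

At evaluation time, the trained policy classifier would then select which policy from the resulting library to activate at each step.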