通过不同语义制图和规划进行自主导航反向强化学习 (Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning) - 专知论文

会员服务 ·

0

逆强化学习 · 学成 · 代价 · 强化学习 · MoDELS ·

2021 年 1 月 1 日

Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning

翻译：通过不同语义制图和规划进行自主导航反向强化学习

Tianyu Wang,Vikas Dhiman,Nikolay Atanasov

from arxiv, 16 pages, 12 figures. arXiv admin note: text overlap with arXiv:2006.05043

This paper focuses on inverse reinforcement learning for autonomous navigation using distance and semantic category observations. The objective is to infer a cost function that explains demonstrated behavior while relying only on the expert's observations and state-control trajectory. We develop a map encoder, that infers semantic category probabilities from the observation sequence, and a cost encoder, defined as a deep neural network over the semantic features. Since the expert cost is not directly observable, the model parameters can only be optimized by differentiating the error between demonstrated controls and a control policy computed from the cost estimate. We propose a new model of expert behavior that enables error minimization using a closed-form subgradient computed only over a subset of promising states via a motion planning algorithm. Our approach allows generalizing the learned behavior to new environments with new spatial configurations of the semantic categories. We analyze the different components of our model in a minigrid environment. We also demonstrate that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of buildings, sidewalks, and road lanes.

翻译：本文侧重于利用远程和语义类观测进行自主导航的反强化学习。目的是推断一个成本函数, 解释所显示的行为, 而仅依靠专家的观察和州控制轨迹。我们开发了一个地图编码器, 从观测序列中推断出语义类别概率, 以及一个成本编码器, 定义为在语义特征上建立深神经网络。由于专家成本无法直接观察, 模型参数只能通过区分显示的控制与根据成本估算计算的控制政策之间的错误来优化。我们提出了一个新的专家行为模式, 使用封闭式子梯度, 只能通过运动规划算法对有希望的一组国家进行计算, 使错误最小化。我们的方法允许将所学过的行为推广到新的环境, 并且对语义分类类别进行新的空间配置。我们分析了我们模型在微型电网环境中的不同组成部分。我们还表明, 我们的方法通过依靠建筑、侧行行道和路道的语义观测, 来学习在自动驱动 CARLA 模拟器中遵循交通规则。

0

相关内容

逆强化学习

逆强化学习

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

【经典书】机器学习白话书，97页pdf，Machine Learning for Humans

【经典书】机器学习白话书，97页pdf，Machine Learning for Humans

专知会员服务

87+阅读 · 2021年1月11日

【Manning新书】微服务安全实战，616页pdf，Microservices Security in Action

【Manning新书】微服务安全实战，616页pdf，Microservices Security in Action

专知会员服务

46+阅读 · 2020年7月22日

【CMU-Google-斯坦福】可控行为的弱监督强化学习，Weakly-Supervised RL

【CMU-Google-斯坦福】可控行为的弱监督强化学习，Weakly-Supervised RL

专知会员服务

22+阅读 · 2020年4月8日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

【CVPR 2019 | tutorial】自主汽车的感知、预测和大规模数据采集：Perception, Prediction, and Large Scale Data Collection for Autonomous Cars

【CVPR 2019 | tutorial】自主汽车的感知、预测和大规模数据采集：Perception, Prediction, and Large Scale Data Collection for Autonomous Cars

专知会员服务

33+阅读 · 2019年11月28日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【泡泡一分钟】通过学习轮式里程计和IMU误差的定位

【泡泡一分钟】通过学习轮式里程计和IMU误差的定位

泡泡机器人SLAM

133+阅读 · 2019年9月12日

ICML2019：Google和Facebook在推进哪些方向？

ICML2019：Google和Facebook在推进哪些方向？

中国人工智能学会

5+阅读 · 2019年6月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

12+阅读 · 2019年1月16日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Adversarial Environment Generation for Learning to Navigate the Web

Arxiv

0+阅读 · 2021年3月2日

Exploring Imitation Learning for Autonomous Driving with Feedback Synthesizer and Differentiable Rasterization

Exploring Imitation Learning for Autonomous Driving with Feedback Synthesizer and Differentiable Rasterization

Arxiv

0+阅读 · 2021年3月2日

Local Navigation and Docking of an Autonomous Robot Mower using Reinforcement Learning and Computer Vision

Arxiv

0+阅读 · 2021年3月2日

MOReL : Model-Based Offline Reinforcement Learning

Arxiv

0+阅读 · 2021年3月2日

Diverse Critical Interaction Generation for Planning and Planner Evaluation

Arxiv

0+阅读 · 2021年3月1日

Monocular Plan View Networks for Autonomous Driving

Monocular Plan View Networks for Autonomous Driving

Arxiv

6+阅读 · 2019年5月16日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

Arxiv

6+阅读 · 2018年6月7日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

VIP会员

文章信息

相关主题

逆强化学习

相关VIP内容

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

【经典书】机器学习白话书，97页pdf，Machine Learning for Humans

【经典书】机器学习白话书，97页pdf，Machine Learning for Humans

专知会员服务

87+阅读 · 2021年1月11日

【Manning新书】微服务安全实战，616页pdf，Microservices Security in Action

【Manning新书】微服务安全实战，616页pdf，Microservices Security in Action

专知会员服务

46+阅读 · 2020年7月22日

【CMU-Google-斯坦福】可控行为的弱监督强化学习，Weakly-Supervised RL

【CMU-Google-斯坦福】可控行为的弱监督强化学习，Weakly-Supervised RL

专知会员服务

22+阅读 · 2020年4月8日

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

【牛津大学】深度残差强化学习，Deep Residual Reinforcement Learning

专知会员服务

84+阅读 · 2020年2月18日

【CVPR 2019 | tutorial】自主汽车的感知、预测和大规模数据采集：Perception, Prediction, and Large Scale Data Collection for Autonomous Cars

【CVPR 2019 | tutorial】自主汽车的感知、预测和大规模数据采集：Perception, Prediction, and Large Scale Data Collection for Autonomous Cars

专知会员服务

33+阅读 · 2019年11月28日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《复杂工程系统模型驱动设计决策支持系统：早期设计阶段挑战》最新138页

《日本陆上自卫队2040年作战方式与未来作战研究》最新23页slides

人工智能作为战争武器

《后勤保障》最新23页

相关资讯

【泡泡一分钟】通过学习轮式里程计和IMU误差的定位

【泡泡一分钟】通过学习轮式里程计和IMU误差的定位

泡泡机器人SLAM

133+阅读 · 2019年9月12日

ICML2019：Google和Facebook在推进哪些方向？

ICML2019：Google和Facebook在推进哪些方向？

中国人工智能学会

5+阅读 · 2019年6月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

12+阅读 · 2019年1月16日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Adversarial Environment Generation for Learning to Navigate the Web

Arxiv

0+阅读 · 2021年3月2日

Exploring Imitation Learning for Autonomous Driving with Feedback Synthesizer and Differentiable Rasterization

Exploring Imitation Learning for Autonomous Driving with Feedback Synthesizer and Differentiable Rasterization

Arxiv

0+阅读 · 2021年3月2日

Local Navigation and Docking of an Autonomous Robot Mower using Reinforcement Learning and Computer Vision

Arxiv

0+阅读 · 2021年3月2日

MOReL : Model-Based Offline Reinforcement Learning

Arxiv

0+阅读 · 2021年3月2日

Diverse Critical Interaction Generation for Planning and Planner Evaluation

Arxiv

0+阅读 · 2021年3月1日

Monocular Plan View Networks for Autonomous Driving

Monocular Plan View Networks for Autonomous Driving

Arxiv

6+阅读 · 2019年5月16日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

CIRL: Controllable Imitative Reinforcement Learning for Vision-based Self-driving

Arxiv

8+阅读 · 2018年7月10日

Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

Arxiv

6+阅读 · 2018年6月7日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

微信扫码咨询专知VIP会员