TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors


January 17, 2021


Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

Simulation has the potential to massively scale evaluation of self-driving systems, enabling rapid development as well as safe deployment. To close the gap between simulation and the real world, we need to simulate realistic multi-agent behaviors. Existing simulation environments rely on heuristic-based models that directly encode traffic rules, which cannot capture irregular maneuvers (e.g., nudging, U-turns) and complex interactions (e.g., yielding, merging). In contrast, we leverage real-world data to learn directly from human demonstration and thus capture a more diverse set of actor behaviors. To this end, we propose TrafficSim, a multi-agent behavior model for realistic traffic simulation. In particular, we leverage an implicit latent variable model to parameterize a joint actor policy that generates socially consistent plans for all actors in the scene jointly. To learn a robust policy amenable to long-horizon simulation, we unroll the policy in training and optimize through the fully differentiable simulation across time. Our learning objective incorporates both human demonstrations and common sense. We show that TrafficSim generates significantly more realistic and diverse traffic scenarios than a diverse set of baselines. Notably, we can exploit trajectories generated by TrafficSim as effective data augmentation for training a better motion planner.
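The closed-loop unrolling mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the policy below is a hypothetical hand-written rule (each actor drifts toward the scene centroid, perturbed by a sampled latent), and every function name and parameter (`joint_policy`, `unroll`, `gain`, `noise_scale`, `min_gap`) is an assumption made for illustration. It only shows the shape of the idea: the policy is applied repeatedly to its own previous output, and the loss is evaluated over the whole rollout, combining an imitation term with a simple "common sense" collision term. Gradient computation is omitted; in practice it would come from an automatic differentiation framework.

```python
import math
import random


def joint_policy(states, latents, gain):
    """Hypothetical joint policy: one (vx, vy) velocity per actor.

    Each actor steers toward the centroid of all actors (a crude stand-in
    for interaction reasoning) and is perturbed by its latent sample.
    """
    n = len(states)
    cx = sum(x for x, _ in states) / n
    cy = sum(y for _, y in states) / n
    return [
        (gain * (cx - x) + zx, gain * (cy - y) + zy)
        for (x, y), (zx, zy) in zip(states, latents)
    ]


def unroll(initial_states, gain, horizon, dt, rng, noise_scale=0.1):
    """Closed-loop rollout: each step consumes the previous simulated state."""
    states = list(initial_states)
    trajectory = [states]
    for _ in range(horizon):
        # Sample one latent per actor from a zero-mean Gaussian prior.
        latents = [
            (rng.gauss(0.0, noise_scale), rng.gauss(0.0, noise_scale))
            for _ in states
        ]
        velocities = joint_policy(states, latents, gain)
        states = [
            (x + dt * vx, y + dt * vy)
            for (x, y), (vx, vy) in zip(states, velocities)
        ]
        trajectory.append(states)
    return trajectory


def imitation_loss(simulated, demonstrated):
    """Mean squared position error over all timesteps and actors."""
    total, count = 0.0, 0
    for sim_frame, demo_frame in zip(simulated, demonstrated):
        for (sx, sy), (dx, dy) in zip(sim_frame, demo_frame):
            total += (sx - dx) ** 2 + (sy - dy) ** 2
            count += 1
    return total / count


def collision_penalty(simulated, min_gap):
    """Common-sense term: penalize actor pairs closer than min_gap."""
    penalty = 0.0
    for frame in simulated:
        for i in range(len(frame)):
            for j in range(i + 1, len(frame)):
                gap = math.hypot(frame[i][0] - frame[j][0],
                                 frame[i][1] - frame[j][1])
                penalty += max(0.0, min_gap - gap) ** 2
    return penalty


if __name__ == "__main__":
    rng = random.Random(0)
    start = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
    rollout = unroll(start, gain=0.5, horizon=10, dt=0.1, rng=rng)
    # A stationary "demonstration" used purely as a placeholder target.
    demo = [start for _ in rollout]
    loss = imitation_loss(rollout, demo) + collision_penalty(rollout, min_gap=1.0)
    print(f"total loss over the rollout: {loss:.4f}")
```

Because the loss is computed on the unrolled trajectory rather than on single-step predictions, errors that compound over time are visible to the objective, which is the motivation the abstract gives for unrolling during training.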

