Generation of Traffic Flows in Multi-Agent Traffic Simulation with Agent Behavior Model based on Deep Reinforcement Learning

In multi-agent traffic simulation, agents are usually assumed to move according to predefined instructions, so they imitate human behavior mechanically and unnaturally. Human drivers, by contrast, accelerate and decelerate irregularly all the time, even when doing so seems unnecessary. To make agents in traffic simulation behave more like humans and recognize other agents' behavior in complex conditions, we propose a unified mechanism in which agents learn to choose among various accelerations using deep reinforcement learning. The input combines regenerated visual images that reveal notable features with numerical vectors that contain important data such as instantaneous speed. By processing batches of sequential data, agents can recognize the behavior of surrounding agents and decide their own acceleration. In addition, we can generate a traffic flow that behaves diversely, as real traffic does, by using an architecture of fully decentralized training and fully centralized execution without violating Markov assumptions.
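The input fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the discrete acceleration set, the tiny two-layer network, the epsilon-greedy rule, and all names and sizes (`ACCELERATIONS`, `build_observation`, `SEQ_LEN`, and so on) are assumptions made for illustration. It only shows how a short sequence of image frames and numeric vectors might be flattened into one observation and mapped to an acceleration.

```python
import random

# Hypothetical discrete acceleration choices in m/s^2 (not taken from the paper).
ACCELERATIONS = [-3.0, -1.0, 0.0, 1.0, 2.0]

def make_weights(n_in, n_out, rng):
    """Small random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def linear(weights, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

def build_observation(frames, vectors):
    """Flatten a sequence of (image, numeric vector) pairs into one input list."""
    obs = []
    for image, vec in zip(frames, vectors):
        obs.extend(pixel for row in image for pixel in row)  # image pixels
        obs.extend(vec)                                      # e.g. instantaneous speed
    return obs

def q_values(params, obs):
    """One score per candidate acceleration (assumed value-based agent)."""
    hidden = relu(linear(params["w1"], obs))
    return linear(params["w2"], hidden)

def choose_acceleration(params, obs, epsilon, rng):
    """Epsilon-greedy choice over the discrete acceleration set."""
    if rng.random() < epsilon:
        return rng.choice(ACCELERATIONS)
    q = q_values(params, obs)
    return ACCELERATIONS[q.index(max(q))]

rng = random.Random(0)
SEQ_LEN, IMG_SIZE, VEC_SIZE = 4, 8, 2  # assumed sizes
n_in = SEQ_LEN * (IMG_SIZE * IMG_SIZE + VEC_SIZE)
params = {
    "w1": make_weights(n_in, 16, rng),
    "w2": make_weights(16, len(ACCELERATIONS), rng),
}

# Random stand-ins for regenerated images; vectors are [speed, previous acceleration].
frames = [[[rng.random() for _ in range(IMG_SIZE)] for _ in range(IMG_SIZE)]
          for _ in range(SEQ_LEN)]
vectors = [[12.5, 0.0] for _ in range(SEQ_LEN)]

obs = build_observation(frames, vectors)
action = choose_acceleration(params, obs, epsilon=0.0, rng=rng)
```

In a decentralized setup, each agent would hold its own `params`; the untrained weights here mean the chosen acceleration is arbitrary.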


Related content

Deep reinforcement learning


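Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning with deep learning techniques. The central task in traditional reinforcement learning is for an agent to learn, from the rewards it receives from the environment, behavior that maximizes reward. Traditional model-free reinforcement learning methods, however, need function approximation for the agent to learn a value function or a policy. Here the strong function approximation capability of deep learning is a natural replacement for hand-specified features, and it makes better-performing end-to-end learning possible.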
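To make the role of function approximation concrete, here is a minimal sketch of one semi-gradient Q-learning update with a linear approximator over hand-specified features. The function names, feature values, and hyperparameters are assumptions for illustration; a deep network would take the place of the linear map and the hand-made features.

```python
def q_value(weights, features, action):
    """Linear approximation: Q(s, a) is the dot product of weights[a] and features."""
    return sum(w * f for w, f in zip(weights[action], features))

def td_update(weights, features, action, reward, next_features, done,
              alpha=0.1, gamma=0.9):
    """Apply one semi-gradient Q-learning step in place and return the TD error."""
    if done:
        best_next = 0.0
    else:
        best_next = max(q_value(weights, next_features, a)
                        for a in range(len(weights)))
    td_error = reward + gamma * best_next - q_value(weights, features, action)
    # For a linear approximator, the gradient of Q with respect to weights[action]
    # is simply the feature vector.
    weights[action] = [w + alpha * td_error * f
                       for w, f in zip(weights[action], features)]
    return td_error

# Two actions, two features, all weights initially zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
err = td_update(weights, [1.0, 0.5], action=1, reward=1.0,
                next_features=[0.0, 1.0], done=False)
```

With zero initial weights, the first update moves only the weights of the action that was taken, in proportion to the observed reward.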
