Objective-aware Traffic Simulation via Inverse Reinforcement Learning

Traffic simulators are an essential component in the operation and planning of transportation systems. Conventional traffic simulators usually employ a calibrated physical car-following model to describe vehicles' behaviors and their interactions with the traffic environment. However, no universal physical model can accurately predict the patterns of vehicle behavior across different situations, and a fixed physical model tends to be less effective in a complicated environment given the non-stationary nature of traffic dynamics. In this paper, we formulate traffic simulation as an inverse reinforcement learning problem and propose a parameter-sharing adversarial inverse reinforcement learning model for dynamics-robust simulation learning. Our proposed model is able to imitate a vehicle's trajectories in the real world while simultaneously recovering the reward function that reveals the vehicle's true objective, which is invariant to different dynamics. Extensive experiments on synthetic and real-world datasets show the superior performance of our approach compared to state-of-the-art methods and its robustness to varying traffic dynamics.
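The abstract does not spell out the model's equations, so the following is only a minimal sketch of the discriminator form commonly used in adversarial inverse reinforcement learning, not the paper's actual implementation. It assumes a learned score `f_value` for a state-action pair and the policy's log-probability `log_pi` for the same pair; both names, and the toy numbers for the three vehicles, are hypothetical. The sketch shows how a reward signal can be read off the discriminator, and how "parameter sharing" could mean applying one and the same function to every vehicle.

```python
import math


def airl_discriminator(f_value, log_pi):
    """D = exp(f) / (exp(f) + pi), computed stably as sigmoid(f - log pi)."""
    z = f_value - log_pi
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def recovered_reward(f_value, log_pi):
    """Reward signal log D - log(1 - D), which simplifies to f - log pi."""
    d = airl_discriminator(f_value, log_pi)
    return math.log(d) - math.log(1.0 - d)


# Hypothetical per-vehicle inputs: (learned score f, policy log-probability).
# The same functions (i.e. shared parameters) are applied to every vehicle.
vehicles = {
    "veh_0": (0.5, math.log(0.2)),
    "veh_1": (-1.0, math.log(0.6)),
    "veh_2": (0.0, math.log(0.5)),
}
rewards = {vid: recovered_reward(f, lp) for vid, (f, lp) in vehicles.items()}
```

Under these assumptions the recovered reward equals `f_value - log_pi`, so the learned score, rather than the policy itself, is what carries the vehicle's objective.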
