Learning a Robust Multiagent Driving Policy for Traffic Congestion Reduction

The advent of automated and autonomous vehicles (AVs) creates opportunities to achieve system-level goals, such as traffic congestion reduction, using multiple AVs. Past research has shown that multiagent congestion-reducing driving policies can be learned in a variety of simulated scenarios. While initial proofs of concept were in small, closed traffic networks with a centralized controller, successful results have recently been demonstrated in more realistic settings, with distributed control policies operating in open road networks where vehicles enter and leave. However, these driving policies were mostly tested under the same conditions they were trained on and have not been thoroughly tested for robustness to different traffic conditions, which is a critical requirement in real-world scenarios. This paper presents a learned multiagent driving policy that is robust to a variety of open-network traffic conditions, including vehicle flows, the fraction of AVs in traffic, AV placement, and different merging road geometries. A thorough empirical analysis investigates the sensitivity of such a policy to the number of AVs in both a simple merge network and a more complex road with two merging ramps. It shows that the learned policy achieves significant improvement over simulated human-driven policies even with AV penetration as low as 2%. The same policy is also shown to be capable of reducing traffic congestion in more complex roads with two merging ramps.
